Chapter 1


Introduction: Thermodynamic 


States as Ensembles 


This chapter introduces a number of notations and sets the background to 


the present book. 


1.1 A few preliminaries 


1.1.1 Classical and quantum statistical mechanics 


Statistical mechanics is the study of large systems, and of subsystems of 


large systems, in terms of probabilities. 


From a fundamental point of view, such a study is to be based on quan- 
tum mechanics, since quantum theory is considered to be of ultimate 
relevance in describing and explaining the properties and interactions of 
matter. However, a large body of observations is more conveniently and 


fruitfully explained in terms of the classical theory. The latter is related 
to the quantum theory in a certain limiting sense which, however, is not
of a simple kind. In any given context of observations, the range of valid- 
ity of the classical theory is defined by means of a number of appropriate 


conditions involving a set of parameters relevant to the context. 


Accordingly, classical and quantum statistical mechanics constitute two 
distinct, though related, areas of study, to be addressed in this book. 
Of these, the former can be built up independently of the latter, starting 
from a number of postulates, but the range of validity of the consequences 
derived in the resulting theory in any given context is to be determined 
by referring to quantum mechanical principles. Quantum statistical me- 
chanics can also be built up by starting from a number of analogous prin- 
ciples, when results obtained in the classical formulation can be found to 


emerge in a limiting sense in the context of specific problems. 


In keeping with what has been mentioned above, the classical and the 
quantum theories will be developed in this book in parallel, where these 
two will be seen to have analogous conceptual frameworks. Once the ba- 
sic principles are formulated, however, the two diverge greatly from each 
other, both in the specific problems addressed and in the technicalities of 


arriving at meaningful results. 


1.1.2 The ‘climb-up’ and the ‘slide-down’ 


Richard Feynman, in his widely acclaimed book on statistical mechanics 
based on a series of lectures [36], begins by writing down one single fun- 


damental formula of (quantum) statistical mechanics (formula (2-15a), 
read with (2-15b), in chapter 2), and refers to it as the ‘summit’, while
the whole of the rest of the subject consists of either a ‘climb-up’ to the 


summit or a ‘slide-down’ from it. 


The climb-up consists of attempts at justifying and explaining the funda- 
mental formula by means of more familiar principles, or ones that can be 
related to more basic considerations, and at elaborating the conditions 
under which the formula acquires relevance. For instance, one would 
like to see how the fundamental formula relates to the basic principles of 
quantum theory (or of classical theory as the case may be). This, however, 
is no easy matter since statistical mechanics essentially involves proba- 
bilistic considerations because of the largeness of the systems it studies. 
Any attempt at seamlessly tying up the basic principles of classical or 
quantum theory with probabilistic considerations necessary to describe 
large systems, is seen to throw up problems of great complexity, where 
one has to be careful to specify the conditions under which the results 
arrived at from the various formulations of the theory (there are, as we 


will see, several of those) can be accepted as being meaningful. 


The slide-down, on the other hand, involves a great many concrete appli- 
cations of the fundamental formula (the ‘summit’; as mentioned above, 
there are, in fact, several fundamental formulas, depending on which of 
several alternative formulations one chooses to work in) to specific classes 
of problems relating to systems of interest, whereby consequences are 
derived that can be tested against actual observations. In this, the fun- 
damental formulas (in the quantum as also in the classical formulation) 


have been amply vindicated, whereby statistical mechanics has developed 
into a greatly fertile field of study.


1.1.3 Equilibrium and non-equilibrium statistical mechanics

The subject of statistical mechanics is divided into two distinct areas: 

equilibrium and non-equilibrium statistical mechanics. The former deals 

with matter in stationary (i.e., time-invariant) macroscopic states in the 


absence of fluxes, and the latter with time-dependent states, and station- 


ary states with non-zero fluxes. 


The fundamental formula referred to by Feynman (or alternative but 
equivalent versions of it that we will also have a look at) defines the 
subject of equilibrium statistical mechanics. Non-equilibrium statisti- 
cal mechanics, in contrast, does not have an analogous central formula 
because it is of an enormous (almost limitless) expanse, and it is quite 
unlikely that the entire field can be captured in one or a few principles 
of central relevance. The principle of increase of entropy describes a com- 
mon feature of non-equilibrium processes, though it does not have the 
explanatory power analogous to the central principle of equilibrium sta- 


tistical mechanics, the ‘summit’ referred to by Feynman. 
1.2 Statistical mechanics and thermodynamics


1.2.1 Macroscopic systems: thermodynamic states and 


processes 


A clear and masterly exposition of equilibrium and non-equilibrium
thermodynamics is to be found in [15].


One major concern of statistical mechanics is to provide an explana- 
tory background for the principles of thermodynamics where, once again, 
there arises the necessity to distinguish between equilibrium and non- 
equilibrium thermodynamics, though it is the former that is commonly 
referred to by the term ‘thermodynamics’ - a practice that we will mostly 


conform to in this book. 


Thermodynamics enunciates the general principles governing the exchange 
of energy and matter between any given macroscopic (or ‘thermodynamic’) 
system and other systems with which it may happen to interact. The 
mode of exchange of matter and energy is conditioned by one or more 
constraints, depending on which the system under consideration may, in 
the course of time, attain a state of equilibrium or else, may more gen- 
erally be described in terms of a non-equilibrium ‘state’ (the term state, 


though, is commonly used to depict an equilibrium configuration). 


The fact that the systems considered in thermodynamics are macroscopic 
ones, without regard to their microscopic constitution, brings up the
necessity of justification of the general principles on which the science 
of thermodynamics is built up, which is precisely where statistical me- 
chanics becomes relevant, because it is specifically designed to look at 
systems made up of a large number of constituents. It is found that 
a system made up of a large number of identical constituents can be 
described in terms of a relatively small number of macroscopic (or ‘ther- 
modynamic’) variables. For an equilibrium state of the system, all these 
macroscopic variables attain stationary or time-independent values de- 
pending on the relevant constraints, where it is required that the latter 
be time-independent ones. Statistical mechanics tells us how the values 
of these thermodynamic variables can be related to the constitution of 
the system under consideration, i.e., to the mutual interaction among
its constituents. Interestingly, the fact that the system is made up of 
a large number of the constituents obviates the necessity of solving for 
the details of their dynamics, and allows us to describe the macroscopic 


behavior of the system in probabilistic terms. 


Under non-equilibrium conditions, the description of the behavior of a 
thermodynamic system assumes a relatively simple form if the system 
happens to be sufficiently close to an equilibrium state (in a sense that 
can be made precise), when the latter can be made use of as a reference 
state. In this case, one again has a set of general principles in terms of 
which its behavior can be described. Interestingly, such a description can 
be obtained in terms of a set of equilibrium parameters such as the
specific heat, the thermal conductivity, and the diffusion coefficient, in
addition to the boundary conditions that determine how the system under
consideration exchanges energy and matter with other systems around 
it. For more general non-equilibrium situations the system behavior gets 
to be more complex, and the explanatory power of statistical mechanics 


becomes correspondingly limited. 


1.2.2 Exchange of energy and matter: enclosures and 


reservoirs 


A thermodynamic system typically exchanges energy and matter with 
other systems through its boundary surface. The mode of exchange of 
energy and matter is often conditioned by walls or enclosures of specific 
types, through which such exchange takes place. For instance, an adi- 
abatic wall prevents the exchange of energy in the form of heat, while a 
diathermic wall allows heat transfer to take place. In addition, the system 
may be enclosed within a semi-permeable wall which allows exchange of 
matter of certain types through it, while blocking other types of material 


constituents. 


The term heat refers to a form of exchange of energy that takes place by 
virtue of a difference of temperature. The idea of temperature is provided 
by the ‘zeroth law’ of thermodynamics which defines the concept of ther- 
modynamic equilibrium. Other than heat, energy exchange can also take 
place in the form of mechanical work, and by virtue of exchange of matter, 
the elementary constituents of which are molecules. Generally speaking, 
the work done on the system under consideration in any given process can 


be expressed in terms of parameters pertaining to the systems that perform 
work on it (the work performed by the system is the negative of work done
on it). However, in the case of a quasi-static process, the work can be ex- 
pressed in terms of parameters pertaining to the system itself. The energy 
exchange by means of exchange of molecules can be termed chemical work, 


while electrical and magnetic work are also possible. 
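As a small illustration of the remark on quasi-static processes, and assuming that the system is a fluid whose only mode of mechanical work is a change of volume, the work performed by the system can be written in terms of its own pressure and volume. The symbols $V_i$ and $V_f$ (initial and final volumes) are introduced here only for this sketch:

```latex
W = \int_{V_i}^{V_f} p \, dV
```

The integral is taken along the sequence of equilibrium states traversed in the quasi-static process; for a process that is not quasi-static the pressure of the system is not well defined along the way, and this expression cannot be used.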


In this book, the term ‘diathermic wall’ will be used to mean an enclosure 
that allows energy exchange only in the form of heat, while the term 
‘adiabatic’ wall will mean an enclosure that allows energy exchange only 
in the form of mechanical work. A semi-permeable wall is one that allows 
the exchange of molecules of some particular type(s) through it. At times, 
the term ‘adiabatic wall’ is used in an extended sense to mean one that 
allows exchange of mechanical and chemical energy, the latter by means 
of molecules of specified types. Finally, the term isolating wall will stand 
for an enclosure that prevents energy and matter exchange in all forms 


with external systems. 


The type of energy and matter exchange, if any, taking place between a given 
system and other systems around it will, in general, have to be read from 
the context since the type of enclosure specifying the boundary condition on 
the system will not be mentioned explicitly. At times, the exchange of energy 
and matter of some desired type can be achieved with a wall enclosing the 


system only partially, or even without a material enclosure. 


A thermodynamic system may be a composite one, being made up of
more than one subsystem, each a thermodynamic system in its own
right, that may exchange matter and energy among themselves, while
exchanges may also occur between the composite system and other sur- 
rounding systems. In statistical mechanics, one also considers one or 
more small subsystems of a thermodynamic system, where the former 


need not qualify as thermodynamic systems. 


The system(s) with which a given thermodynamic system exchanges mat- 
ter and energy will often be assumed to be in the nature of reservoirs, 
these being thermodynamic systems of such large size that they con- 
tinue to be in equilibrium states even as the system of interest exchanges 
matter and energy with these. This is evidently an idealization, but one 
that is useful in enunciating the basic principles of thermodynamics and 
statistical mechanics. Reservoirs may be of various different types. For 
instance, a heat reservoir (or a ‘thermal reservoir’), characterized by some
given temperature, exchanges energy in the form of heat, while a particle 
reservoir exchanges molecules of some specified type, being characterized 


by a given chemical potential. 


1.2.3 The first and second laws of thermodynamics 


Considering a process connecting any two given equilibrium states of a
system, depicted by points ‘i’ and ‘f’ in a thermodynamic diagram (at times
referred to as a thermodynamic state space) in fig. 1-1, one can write the
energy balance equation for the process, referred to as the first law of
thermodynamics, in the form

$$Q = \Delta U + W, \tag{1-1a}$$

where $Q$ stands for the heat absorbed by the system, $W$ for the work
performed by the system, and $\Delta U$ for the increase in its internal energy
($U$), measured by the work performed on it in an adiabatic process taking
it from ‘i’ to ‘f’.


There always exists an adiabatic process connecting the two states ‘i’ and
‘f’ that may be made to occur in at least one of the two possible directions
(i.e., either from ‘i’ to ‘f’ or from ‘f’ to ‘i’), where the process may or may not
be reversible. If an adiabatic process taking the system from ‘i’ to ‘f’ does
not exist, then $\Delta U$ is to be defined as the negative of the work done on the
system in an adiabatic process from ‘f’ to ‘i’.


Here and in the following we will mostly consider, for the sake of simplicity, a 
one-component fluid as our system of interest and, unless otherwise stated, 
the mole number will be assumed to be fixed. The work W, in that case, will 


be simply the mechanical work performed by the fluid. 


If the states ‘i’ and ‘f’ be infinitesimally close to each other, then the above
formula can be written as

$$T\,\delta S = \delta U + p\,\delta V, \tag{1-1b}$$


where the second law of thermodynamics has been made use of, with $S$
denoting the entropy of the system, while $T$, $p$, and $V$ stand for the
absolute temperature, pressure, and volume of the system (which we have
assumed, for the sake of simplicity, to be a fluid with a fixed mole
number). The second law of thermodynamics implies that, for any process
occurring in an isolated system, its increase in entropy $\Delta S$ satisfies

$$\Delta S \geq 0. \tag{1-1c}$$


1. For a fluid with a variable mole number, the formula (1-1b) assumes
the more general form

$$T\,\delta S = \delta U + p\,\delta V - \tilde{\mu}\,\delta\nu, \tag{1-2}$$

where $\nu$ stands for the mole number, and $\tilde{\mu}$ for the chemical
potential per mole.


2. The symbol $\delta$ is commonly used to denote a change in the value of a
quantity that can be made arbitrarily small, while $\Delta$ is used to denote
a change by a finite amount.


Figure 1-1: Depicting a process in a thermodynamic state space; points ‘i’ and ‘f’
represent two equilibrium states of a system; the solid line joining the two points
represents a quasi-static process from the initial state ‘i’ to the final state ‘f’; the dotted line
represents an irreversible process which, strictly speaking, cannot be depicted in the
thermodynamic diagram; for the sake of simplicity, we consider a one-component fluid
with a fixed mole number as the system of interest, for which $p$, $V$ stand for its pressure
and volume respectively.
1.2.4 Equilibrium states: descriptions in terms of alternative sets of variables


An equilibrium state (or, simply, a state) is characterized by well defined
values of all its thermodynamic variables, not all of which are independent,
and one can choose a basic set of independent variables in terms
of which all the other thermodynamic variables can be expressed. For
instance, in the case of a one-component fluid one can choose $U$ and
$V$ as the fundamental variables, in terms of which one can express its
entropy $S$ (as an example, the entropy of an ideal gas is given, in the
classical theory, by the expression (3-23), derived in chapter 2) in the form of
a functional relation

$$S = S(U, V, N), \tag{1-3}$$

where the number of molecules $N$ has been assumed above to be fixed
but can be included as an additional variable in the description of the
state (though a more appropriate thermodynamic variable would be the
mole number $\nu = N/N_A$, where $N_A$ stands for the Avogadro number). With $S$
known in terms of the fundamental variables $U, V, N$ (note that we have
now included $N$ as a thermodynamic variable), one can invert the above
relation and express $U$ in terms of $S, V, N$ (it is a fundamental fact in
thermodynamics that such an inversion is possible, i.e., the transformation
from $(U, V, N)$ to $(S, V, N)$ is an allowed one). Starting, then, from either
$U, V, N$ or $S, V, N$ as the basic set of variables, one can obtain other
thermodynamic variables by appropriate partial differentiation. For instance,
the temperature is obtained as

$$\frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_{V,N}, \tag{1-4}$$

and the pressure and chemical potential (per particle) as

$$p = T\left(\frac{\partial S}{\partial V}\right)_{U,N}, \qquad \mu = -T\left(\frac{\partial S}{\partial N}\right)_{U,V}. \tag{1-5}$$
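The partial differentiations just described can be tried out numerically. The following is a minimal sketch, assuming the entropy of a monatomic ideal gas in the form $S = N k_B[\ln(V/N) + \tfrac{3}{2}\ln(U/N)]$ with the additive constant dropped and with units in which $k_B = 1$; the function names and the sample values of $U, V, N$ are chosen only for illustration and do not come from the text.

```python
import math

K_B = 1.0  # assumption: units in which the Boltzmann constant equals 1


def entropy(U, V, N):
    """Monatomic ideal-gas entropy, additive constant dropped (assumed form)."""
    return N * K_B * (math.log(V / N) + 1.5 * math.log(U / N))


def partial(f, args, index, rel_step=1e-6):
    """Central-difference partial derivative of f with respect to args[index]."""
    h = abs(args[index]) * rel_step
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def temperature(U, V, N):
    """Temperature from 1/T = (dS/dU) at fixed V, N."""
    return 1.0 / partial(entropy, (U, V, N), 0)


def pressure(U, V, N):
    """Pressure from p = T (dS/dV) at fixed U, N."""
    return temperature(U, V, N) * partial(entropy, (U, V, N), 1)


# Illustrative values only.
U, V, N = 1500.0, 2.0, 1000.0
T = temperature(U, V, N)
p = pressure(U, V, N)
print("T =", T)
print("p V / (N k_B) =", p * V / (N * K_B))
```

Under these assumptions the two printed numbers should agree with each other, which is the ideal gas equation of state $pV = N k_B T$, while $T$ itself should equal $2U/(3N k_B)$.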


Incidentally, for $U, V, N$ as our choice of the fundamental variables, $S$
plays the role of a thermodynamic potential, as ordained by the second
law of thermodynamics: for a system with fixed $U, V, N$ (i.e., for an
isolated system) the value of $S$ is an extremum (actually, a maximum) as
compared with other states with the same $U, V, N$, but with additional
internal constraints (for instance, in the case of an isolated mass of gas in
a cylinder, a partition placed inside the latter; the pressure and
temperature on the two sides of the partition may be different); as the internal
constraints are removed (for instance, by removing the partition), the
entropy increases or remains unchanged.
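The maximum property of the entropy can be illustrated with a small numerical experiment. The sketch below assumes two compartments of a monatomic ideal gas (entropy taken up to an additive constant, units with $k_B = 1$) separated by a rigid, impermeable partition; the internal constraint is the way the fixed total energy is shared between the two sides. All names and numbers are hypothetical choices made for this example.

```python
import math


def entropy(U, V, N):
    """Monatomic ideal-gas entropy (k_B = 1), additive constant dropped."""
    return N * (math.log(V / N) + 1.5 * math.log(U / N))


def temperature(U, N):
    """Temperature of a monatomic ideal gas, from 1/T = dS/dU = 3N/(2U)."""
    return 2.0 * U / (3.0 * N)


# Hypothetical compartments; the composite system is isolated.
N1, V1 = 100.0, 1.0
N2, V2 = 300.0, 2.0
U_TOTAL = 600.0


def total_entropy(U1):
    """Entropy of the composite system when compartment 1 holds energy U1."""
    return entropy(U1, V1, N1) + entropy(U_TOTAL - U1, V2, N2)


# Scan the possible ways of sharing the fixed total energy.
grid = [U_TOTAL * i / 10000 for i in range(1, 10000)]
U1_best = max(grid, key=total_entropy)

print("energy in compartment 1 at maximum entropy:", U1_best)
print("temperatures:", temperature(U1_best, N1),
      temperature(U_TOTAL - U1_best, N2))
```

In this sketch the entropy is largest for the energy sharing at which the two temperatures coincide; any other sharing, which could be maintained only by a thermally insulating partition, has a lower total entropy.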


Instead of choosing $U, V, N$ (or, equivalently, $S, V, N$, as the case may be),
one can alternatively choose, say, $T, V, N$ as the fundamental set of
variables, in which case the Helmholtz free energy $F$, defined as

$$F = U - TS, \tag{1-6}$$

assumes the role of the thermodynamic potential (whose value is now a
minimum as compared to states with the same values of $T, V, N$, along
with possible additional constraints), and all other thermodynamic
variables are obtained by appropriate partial differentiation.
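As a brief worked step, one can see which partial derivatives of $F$ are meant. The sketch assumes the differential form of the first and second laws with the number of molecules $N$ allowed to vary and with $\mu$ the chemical potential per particle, i.e., $T\,dS = dU + p\,dV - \mu\,dN$:

```latex
\begin{aligned}
dF &= dU - T\,dS - S\,dT \\
   &= -S\,dT - p\,dV + \mu\,dN, \\[4pt]
S &= -\left(\frac{\partial F}{\partial T}\right)_{V,N}, \qquad
p = -\left(\frac{\partial F}{\partial V}\right)_{T,N}, \qquad
\mu = \left(\frac{\partial F}{\partial N}\right)_{T,V}.
\end{aligned}
```

The internal energy then follows from the definition of $F$ as $U = F + TS$.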


1.2.5 The thermodynamic limit: equivalence of alternative descriptions


Strictly speaking, thermodynamics makes sense only for infinitely large
systems - ones, moreover, with a finite density. Thus, in the case of a
one-component fluid, one has to consider the limit $V \to \infty$, $N \to \infty$,
$V/\nu \to v$, where the specific volume $v$ has some well defined finite value (the mass
density is related to $v$ as $\rho = M/v$, where $M$ stands for the molar mass of the
fluid). Moreover, the volume of the system cannot go to infinity in just any
arbitrary manner: one has to ensure that the shape and surface area of
the enclosing volume do not have any role in determining the properties of
the system in the limit considered (otherwise, additional variables relating
to the surface are to be included in the list of fundamental variables).


Considerations along the lines indicated above are said to define the
thermodynamic limit for the system under consideration. In the case of a
one-component fluid considered here for the sake of simplicity, all its
thermodynamic variables in this limit will be functions of $u\,(= U/\nu)$, the
specific energy, and $v$ (or of $T, v$ in the alternative representation, where
one starts from $T, V, N$).


As V and N go to infinity in the thermodynamic limit, the internal en- 
ergy U also goes to infinity. This requires the mutual interaction (refer 


to sec. 1.3.2.3 below where the potential energy function is introduced in 
the classical context) among the particles making up the system to satisfy
certain conditions (see section 5.2), where these conditions ensure thermo- 
dynamic stability of the system. If the stability criteria are satisfied then 
the thermodynamic limit implies thermodynamic behavior which, in turn, 
means that the variables like the internal energy and entropy considered in 


thermodynamics are extensive ones. 
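Extensivity can be checked directly on a model expression. The sketch below again assumes the monatomic ideal-gas entropy (additive constant dropped, $k_B = 1$), which is not derived at this point of the text; it scales $U$, $V$, and $N$ by a common factor and compares the entropy per particle. The names and sample values are illustrative only.

```python
import math


def entropy(U, V, N):
    """Monatomic ideal-gas entropy (k_B = 1), additive constant dropped."""
    return N * (math.log(V / N) + 1.5 * math.log(U / N))


# Illustrative reference state.
U, V, N = 1500.0, 2.0, 1000.0
s_per_particle = entropy(U, V, N) / N

# Scale all extensive variables by the same factor and recompute S / N.
scales = (1.0, 10.0, 1.0e3, 1.0e6)
ratios = [entropy(k * U, k * V, k * N) / (k * N) for k in scales]

for k, ratio in zip(scales, ratios):
    print(k, ratio)
```

For this model the entropy per particle depends only on the ratios $U/N$ and $V/N$, so that the printed values coincide: the entropy scales in proportion to the size of the system, which is what extensivity means.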


While actual macroscopic systems that are observed and studied
experimentally are of finite, though large, size, the theory addressing the
behavior of these systems holds rigorously only in the thermodynamic limit.
For most actual systems, the physical predictions of the theory agree with
experimental observations so closely that the discrepancy is not of any
significance. In this book, we will consider the behavior of macroscopic
systems and will attempt to set up a correspondence between results derived
in statistical mechanics and those describing their thermodynamic behavior,
where we will assume that the number of constituents of such a system is so
large as to be sufficiently close to the thermodynamic limit without, however,
stating precisely what the term ‘close’ means. It will be implied that
the difference between the actual behavior of such a large but finite
system and the behavior predicted on the basis of the thermodynamic limit
is small enough so as to be insignificant from the experimental point of
view. It will then be said that the number of constituents of the system
under consideration is large ‘in the sense of the thermodynamic limit’.


As mentioned above, the thermodynamic behavior of macroscopic sys- 


tems presupposes their stability, where stability means that the basic 
constituents of such a system do not either get concentrated within an
infinitesimally small volume or disperse away from one another to in- 
finitely large distances. The stability problem of macroscopic systems 
can be addressed on a rigorous basis only in the thermodynamic limit 


(see section 5.2, chapter 5). 


At the same time, the thermodynamic limit assumes relevance in the 
context of equivalence between the various possible descriptions of the 


behavior of a system in terms of basic sets of thermodynamic variables. 


Let us consider, for the sake of concreteness, the two alternative sets of 
variables (U,V) and (T,V) for a pure gas in a cylinder, with a fixed value of
N, where U,V, N are all large but finite (i.e., close to the thermodynamic 


limit). 


Operationally speaking, the specification of U and V requires that the gas 
be enclosed within isolating walls, i.e., the walls of the cylinder be made 
of rigid, non-permeable, and adiabatic material (one can change U,V as 
parameters, i.e., set these to different possible values, if required, by 
special arrangement). With fixed values of U,V (recall that N has already
been set at a fixed value), if one measures the temperature of the gas by 
inserting a thermometer, one will find that it registers some specific value 


of T to a good degree of approximation. 


If, now, the gas be kept in a cylinder with rigid, non-permeable, and 
diathermic walls, and thermal contact is established with a reservoir of 


known temperature T (the same as the temperature measured in the
previous observation; the gas is assumed to be in equilibrium in all these
considerations), then a measurement of its internal energy U (a direct 
and accurate measurement of U is not possible but, in the present con- 
text, we assume that such a measurement can be performed in principle) 
will reveal that the value of U is the same (again, to a good degree of
approximation) as that in the previous part of the observation.


One then infers that a description in terms of U,V is actually (i.e., from 


an operational point of view) equivalent to that in terms of T,V. 


However, if it were possible to execute the notional observations men- 
tioned above with arbitrarily high precision, it would be found that the 
said equivalence is offset by small but finite fluctuations. For instance,
in the measurement of T for fixed U,V (the first of the two observations 
mentioned above), one would observe small fluctuations in the measured 
value of T (this fluctuation being revealed in a large number of measure- 
ments performed under identical conditions), with a certain mean value. 
If, on the other hand, one fixes T at this mean value along with V (in the 
second observation considered above), then one would find fluctuations 
in the measured value of U, with a mean value close to the fixed value of 


U in the first observation. 


The fluctuations get diminished as one considers progressively larger
values of $V$ and $N$, and exact equivalence between the $(U, V)$ and the $(T, V)$
descriptions is established only in the thermodynamic limit outlined above.
In this limit, the behavior of a gas (or, more generally, of a macroscopic
system described in terms of appropriate sets of variables) is
completely described either in terms of $(u, v)$ or in terms of $(T, v)$, since
the two descriptions then become strictly equivalent.
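The way fluctuations shrink with system size can be illustrated by a small simulation. The sketch rests on an assumption taken from results developed only later in statistical mechanics: for a classical monatomic ideal gas of $N$ particles held at temperature $T$, the total kinetic energy follows a gamma distribution with shape $3N/2$ and scale $k_B T$, so that its relative fluctuation is $\sqrt{2/(3N)}$. The function name, the sample sizes, and the seed are arbitrary choices for this example.

```python
import math
import random


def relative_fluctuation(n_particles, kT=1.0, n_samples=20000, seed=0):
    """Sampled (standard deviation / mean) of the energy at fixed temperature.

    Assumes the total energy is gamma distributed with shape 3N/2 and
    scale kT (classical monatomic ideal gas in contact with a heat reservoir).
    """
    rng = random.Random(seed)
    shape = 1.5 * n_particles
    energies = [rng.gammavariate(shape, kT) for _ in range(n_samples)]
    mean = sum(energies) / n_samples
    variance = sum((e - mean) ** 2 for e in energies) / n_samples
    return math.sqrt(variance) / mean


for n in (10, 1000, 100000):
    sampled = relative_fluctuation(n)
    expected = math.sqrt(2.0 / (3.0 * n))
    print(n, sampled, expected)
```

The sampled relative fluctuation follows the $N^{-1/2}$ trend in this model. For a number of molecules of the order of the Avogadro number it would be of the order of $10^{-12}$, which is why the two descriptions are operationally indistinguishable for macroscopic samples.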


In explaining the thermodynamic behavior of a system, where there ex- 
ist descriptions in terms of alternative sets of thermodynamic variables, 
one can adopt any one of several alternative approaches in statistical 
mechanics too (whether in the classical or the quantum context). Once 
again, all these alternative descriptions turn out to be equivalent in the 


thermodynamic limit. 


The stability of macroscopic systems and the equivalence of the various 
possible thermodynamic descriptions are two aspects of thermodynamic be- 
havior for which, strictly speaking, the thermodynamic limit constitutes a 
pre-condition. The problems of stability and equivalence in statistical me- 


chanics will engage our attention in sections 5.2 and 5.4. 


1.2.6 Phase transitions 


However, the equivalence may be of a very special nature under certain
circumstances. Consider, for instance, a representation in terms of $S, V$
for which $U$ can be uniquely determined from the formula (1-3) by
inversion ($N$ can be included as a variable determining the functional form of
$S$ but, for the present discussion, $N$ and $V$ will be assumed to have fixed
and arbitrarily large values, in keeping with the thermodynamic limit;
recall that thermodynamic behavior requires the thermodynamic limit as a
pre-condition). As we have seen, one can adopt a representation where
$U$ is replaced with $T$ and $S$ is replaced with $F$. From the mathematical
point of view, the transformation from $(U, S)$ to $(T, F)$ can be described as
a Legendre transformation (for a clear exposition, see [15], chapter 5; as
mentioned above, Callen’s is one of the great books in the field of
thermodynamics and statistical mechanics).


For a state with any specified value of $S$ (along with $V, N$, which have been
assumed to be fixed from the beginning) one obtains a unique
thermodynamic state with some specific value of $U$, where the values of $F$ and $T$ are
also uniquely determined. However, there may be certain special values
of $T$ such that, for any such value, say, $T = T_0$, $F$ can be obtained uniquely
(say, $F = F_0$), but there do not correspond uniquely determined values
of $U, S$ for $T_0, F_0$. Instead, there corresponds a range of values of $S$ and
a corresponding range of values for $U$, with a linear relation between $U$
and $S$ (this makes the free energy $F = U - T_0 S$ possess the fixed value $F_0$
throughout this range).


In other words, while the specification of $S$ (and hence of $U$; recall once
again that $V$ and $N$ have already been fixed arbitrarily close to the
thermodynamic limit) corresponds to a uniquely defined thermodynamic state,
the same cannot be said of $T$ and $F$ since for a special value like $T = T_0$
(and, accordingly, $F = F_0$) there corresponds not one but a range of
thermodynamic states.


From the mathematical point of view, this is not in conflict with the theory
underlying Legendre transformations which tells us that, in the present
context (where $U$ is a convex function of $S$), there may exist isolated
values of $T$ (for specified values of $V, N$) such that, for any of these
values, a single point in the graph of $F$ against $T$ (such as the point with
$T = T_0$, $F = F_0$) may correspond to a linear stretch of points in the graph
of $U$ against $S$. What happens here is that the derivative $\partial F/\partial T$ possesses
a jump discontinuity at $(T_0, F_0)$, but the derivative $\partial U/\partial S$ possesses no such
discontinuity anywhere in the linear stretch in the graph of $U$ against $S$,
including its end points, the value of the derivative being $T_0$ throughout.
However, given the graph of $F$ against $T$ one can unambiguously
construct that of $U$ against $S$, following the rules of obtaining the inverse
Legendre transformation.


The relation between $U$ and $S$, and the corresponding relation between $F$
and $T$, in the neighborhood of a phase transition is shown schematically
in fig. 1-2 where, in (A), we find a linear stretch AB in the graph of $U$
against $S$, corresponding to the coexistence of two phases (the parts CA
and BD correspond to pure phases); in graph (B), showing the variation
of $F$ against $T$, this linear stretch corresponds to the single point E (with
FE and EG of (B) corresponding, respectively, to CA and BD of (A)). The
linear stretch AB in graph (A) corresponds to the addition of latent heat
at a constant temperature $T_0$.
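The correspondence between a linear stretch in $U(S)$ and a kink in $F(T)$ can be reproduced numerically. The sketch below assumes a made-up convex model for $U(S)$, with quadratic pure-phase branches joined by a straight segment of slope $T_0$ between $S_A$ and $S_B$, and evaluates the Legendre transform as $F(T) = \min_S\,[U(S) - TS]$ on a grid. The model, the parameter values, and the names are illustrative and are not taken from the text.

```python
# Hypothetical model parameters (arbitrary units).
T0, C = 1.0, 1.0              # transition temperature, curvature of pure branches
S_A, S_B, U_A = 1.0, 2.0, 1.0  # end points of the coexistence stretch
U_B = U_A + T0 * (S_B - S_A)   # U rises linearly with slope T0 along AB


def internal_energy(S):
    """Convex U(S): pure phase, linear coexistence stretch, pure phase."""
    if S < S_A:
        return U_A + T0 * (S - S_A) + 0.5 * C * (S - S_A) ** 2
    if S <= S_B:
        return U_A + T0 * (S - S_A)
    return U_B + T0 * (S - S_B) + 0.5 * C * (S - S_B) ** 2


S_GRID = [3.0 * i / 3000 for i in range(3001)]


def free_energy(T):
    """Legendre transform F(T) = min over S of [U(S) - T S], on a grid."""
    return min(internal_energy(S) - T * S for S in S_GRID)


# One-sided finite-difference slopes of F at T0.
h = 0.01
F0 = free_energy(T0)
left_slope = (F0 - free_energy(T0 - h)) / h
right_slope = (free_energy(T0 + h) - F0) / h

print("F0 =", F0)
print("left slope  =", left_slope, " (compare with -S_A =", -S_A, ")")
print("right slope =", right_slope, " (compare with -S_B =", -S_B, ")")
```

In this model the whole stretch AB collapses to the single value $F_0 = U_A - T_0 S_A$, while the one-sided slopes of $F$ at $T_0$ are close to $-S_A$ and $-S_B$. Their difference, multiplied by $T_0$, is the latent heat $T_0 (S_B - S_A)$, which is how the stretch AB can be recovered from the graph of $F$ against $T$.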


From the physical point of view, this is symptomatic of a phase transition 
at temperature $T_0$. The linear stretch of the graph of U against S
corresponds to the coexistence of two phases where the relative abundance of
the phases changes from point to point, with the two end points of the 


linear stretch representing the two pure phases. 


Figure 1-2: Depicting the Legendre transformation from (A) the graph of U against
S to that of (B) F against T; in (A), the parts CA and BD correspond to pure phases,
while the linear stretch AB corresponds to coexistence of two phases where U increases
linearly with S at a constant T (= $T_0$) due to the addition of latent heat; in (B), the
parts FE and EG correspond to the pure phases, while F remains constant at the value
$F_0$ for the entire range of states in which the two phases coexist; the derivative
$dF/dT$ is discontinuous at ($T_0$, $F_0$); the left- and right-hand derivatives at this
point can be made use of in reconstructing the stretch AB of graph (A).


In the case considered here the phase transition is indicated by a
discontinuity of the first order derivative of the thermodynamic potential F.
More generally, a phase transition corresponds to a loss of smoothness
of a thermodynamic potential at some particular temperature, with the
other relevant thermodynamic variables held fixed.
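The kink in F at $T_0$ can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a toy convex function U(S) with a linear stretch of slope $T_0 = 1$ between S = 1 and S = 2, computes F(T) as the minimum over S of U(S) - TS by brute-force search on a grid, and uses the standard relation dF/dT = -S to read off the entropies of the two pure phases from the one-sided slopes. The function names and all numbers are illustrative assumptions.

```python
def U(S):
    """Toy internal energy (assumed): convex, linear with slope T0 = 1 on 1 <= S <= 2."""
    if S <= 1.0:
        return 0.5 * S * S
    if S <= 2.0:
        return 0.5 + (S - 1.0)
    return 1.5 + (S - 2.0) + 0.5 * (S - 2.0) ** 2


def F(T, s_max=4.0, n=4000):
    """Legendre transform F(T) = min over S of [U(S) - T*S], searched on a uniform grid."""
    return min(U(i * s_max / n) - T * (i * s_max / n) for i in range(n + 1))


T0, h = 1.0, 0.01
left_slope = (F(T0) - F(T0 - h)) / h    # approximately -S_A (entropy at end point A)
right_slope = (F(T0 + h) - F(T0)) / h   # approximately -S_B (entropy at end point B)
latent_heat = T0 * (left_slope - right_slope)  # T0 * (S_B - S_A)
print(left_slope, right_slope, latent_heat)
```

In this toy model the two one-sided slopes come out close to -1 and -2, so the jump in the slope recovers the length of the linear stretch AB, and multiplying by $T_0$ gives the latent heat.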


As we will see below (refer to section 5.4) the problem of equivalence of
various possible representations of thermodynamic states of a system
(related to one another by Legendre transformations) possesses a
counterpart of quite considerable significance in statistical mechanics.


1.3 Pure and mixed states of systems


1.3.1 Pure and mixed states: Introduction 


The basic principles of classical and quantum mechanics are commonly
enunciated with reference to pure states of systems.


We will start from the assumption that the concept of state of a system
does not need further elaboration. However, the distinction between pure
and mixed states will be relevant for our purpose, in the context of
statistical mechanics. A pure state is one whose specification is maximal in
the sense that it tells us all that one can possibly know about the results
of observations or measurements on the system, and no more precise
information is possible about such results. Pure states in classical
mechanics are represented by points in the phase space of the system (see
sec. 1.3.2.2 where the correspondence between pure states and points
in the phase space is explained). A mixed state, on the other hand, is
one where such a maximal specification is not possible and, instead, the
state of the system can be thought of as being a mixture of a number of
pure states, with a particular probability associated with each
of these pure states in the mixture. Analogous considerations
differentiate the mixed states from pure ones in the quantum description (see
sections 1.3.3, 1.3.6).


Though the above description of pure and mixed states is not as precise
as one would like it to be, it can be complemented with better-defined
statements in mathematical terms. We will take up the question
of mathematical representation of pure and mixed states in the classical
and quantum theories in subsequent sections in this chapter, where we
will find that the description in quantum terms differs in kind from that
in classical terms, though the fact remains that, in either case, a mixed
state can be described as being constituted of a number of pure states,
with specific probabilities associated with the latter.


In statistical mechanics, a correspondence is set up between certain classes
of mixed states of a macroscopic system and the states of that system
described in thermodynamic terms. Such states are, at times, referred to
as macrostates, in contrast to microstates that commonly refer to pure
states of a macroscopic system, where a microstate requires an
enormously large amount of information for its specification.


A macroscopic system is made up of a large number (say, N) of
constituents where, generally speaking, the term ‘large’ is to be interpreted
in the sense of the thermodynamic limit mentioned above.


However, in a formal sense, the basic formulas of statistical mechanics
can be stated even without regard to any estimate of how large N should
be. The question of largeness arises as one attempts to give a physical
interpretation of these formulas, i.e., to establish correspondence with
those of thermodynamics.


In statistical mechanics, one considers, at times, subsystems of large
systems, where a subsystem need not be a large one. If, on the other hand, a
subsystem is large in the sense of the thermodynamic limit, then its
behavior will also conform to a thermodynamic description.


1.3.2 The classical description of systems: pure states 


1.3.2.1 Introducing the phase space 


In classical mechanics, the states of a system with s degrees
of freedom can be represented in a phase space of 2s dimensions, made
up of (generalized) position co-ordinates $q_1, q_2, \ldots, q_s$ and corresponding
momentum co-ordinates $p_1, p_2, \ldots, p_s$. These two sets will, for the sake of
brevity, be denoted by the symbols {q} and {p} respectively.


In later sections of the book, we will introduce a further change in notation,
replacing the symbols {q},{p} for the sets of position and momentum
co-ordinates with Q,P respectively (see sec. 2.2), reverting to {q},{p} when
necessary for the sake of clarity. More specifically, Q,P will refer to the
position and momentum variables considered collectively. Other notational
improvisations will be explained as and when necessary.


In the case of a system made up of N point particles, one has
$s = 3N$, i.e., the phase space is 6N dimensional. The co-ordinates $q_i$
($i = 1, 2, \ldots, s = 3N$) are then commonly taken as the Cartesian components
of the position vectors $\mathbf{r}_a$ ($a = 1, 2, \ldots, N$) with reference to any chosen
co-ordinate system, and $p_i$ ($i = 1, 2, \ldots, s = 3N$) as the components of the
corresponding momentum vectors $\mathbf{p}_a$. It is not essential here to explicitly
refer to the correspondence between the sets {q}, {p} and the Cartesian
components of the position and momentum vectors of the particles.


1.3.2.2 A system of identical particles: the ‘mu-space’ and the ‘gamma-space’


The phase space of the system of N particles can be looked upon as a
direct product (in the sense of set theory) of phase spaces corresponding
to the individual particles.


In order to fix our ideas, we consider, for the sake of simplicity, a system
made up of just two particles (N = 2), each moving in one dimension (s =
2), and assume, moreover, that the two particles are distinct from each
other. Here the index i is used to distinguish between the two particles,
with $q_i, p_i$ ($i = 1, 2$) denoting the position and momentum of the ith particle.


The phase space of each individual particle is termed its ‘mu-space’.
Thus, we have here two distinct mu-spaces, which we denote by symbols
$M^{(1)}$, $M^{(2)}$, where $M^{(i)}$ ($i = 1, 2$) is made up of phase space co-ordinates
$q_i, p_i$. The phase space of the system (of two distinct particles) under
consideration, referred to as the ‘gamma-space’ (we denote this as $\Gamma$) can
then be represented as $\Gamma = M^{(1)} \otimes M^{(2)}$ with phase space co-ordinates
$(q_1, p_1, q_2, p_2)$, where the first two of the four phase space co-ordinates refer
to the particle labeled ‘1’ and the next two to the particle ‘2’ (however,
the phase space co-ordinates of a point are commonly denoted with a
re-ordered sequence as $(q_1, q_2; p_1, p_2)$).


In statistical mechanics, we often consider systems of identical particles
(or systems made up of groups of identical particles). This means that,
in the case of two particles (each moving in one dimension) considered
above, the distinguishing labels ‘1’ and ‘2’ can be assigned only notionally
and an interchange of the positions and momenta of the two particles
does not give a new state where, for the time being, we refer to pure
states, represented by points in the phase space.


This is depicted in fig. 1-3 below where A and B represent (pure) states of
the two particles in their respective mu-spaces $M^{(1)}$, $M^{(2)}$, corresponding
to some point (call it C) in the product space $\Gamma = M^{(1)} \otimes M^{(2)}$. However, the
point C’ in the product space (the product space, and the points C, C’ are
not shown) resulting from points A’, B’ in the respective mu-spaces (the
co-ordinates of A’ in $M^{(1)}$ are the same as those of B in $M^{(2)}$, while A, B’
are similarly related) corresponds to the same (pure) state of the system
made up of the two particles, when they are identical. In this case, two
distinct points in the space $\Gamma$ represent one and the same state of the
system. The actual state space of the system is $\bar{\Gamma}$, obtained by identifying
pairs of points such as C, C’ in $\Gamma$.


Figure 1-3: Depicting the mu-spaces $M^{(1)}$, $M^{(2)}$ (schematic) of two particles, where A,
B represent pure states of the particles, corresponding to a pure state (C; not shown) in
the product space $\Gamma$ (not shown); A’, B’ are another pair of states obtained from A, B by
interchanging their co-ordinates, giving rise to the point C’ in $\Gamma$; if the two particles are
identical, then two points C and C’ in $\Gamma$ represent one and the same state of the system
made up of these two; in this case the actual state space $\bar{\Gamma}$ (not shown) is obtained by
identifying pairs of points such as C, C’ in $\Gamma$.


Think of a system of two identical billiard balls moving on a billiards
table. According to the point of view of classical mechanics, as also in our
common-sense point of view, the two balls can be notionally distinguished
as the ‘first’ and the ‘second’ ones. However, the instantaneous state of
motion of the system, represented in terms of two position co-ordinates (say,
$\alpha, \beta$) and two corresponding momentum co-ordinates (say, $\xi, \eta$), does not
depend on which of the two particles has position and momentum $(\alpha, \xi)$ and
which one has position and momentum $(\beta, \eta)$. In other words, the two points
$(q_1 = \alpha, q_2 = \beta; p_1 = \xi, p_2 = \eta)$ and $(q_1 = \beta, q_2 = \alpha; p_1 = \eta, p_2 = \xi)$ in the gamma
space correspond to the same state of the system under consideration.


Put differently, the state space is not the same as the ordered product
$M^{(1)} \otimes M^{(2)}$ of the two mu-spaces with the distinguishing labels ‘1’ and ‘2’
attached to the two. This ordered product is only a notionally defined one
and has been denoted above by the symbol $\Gamma$. Each state of the system
under consideration corresponds to two different points in $\Gamma$. The actual
phase space (to be denoted by the symbol $\bar{\Gamma}$ from now on), on the other
hand, whose points have a one-to-one correspondence with the states of
the system, is to be defined without regard to the ordering among the
individual particles.


Generalizing now to a system made up of N identical particles, the state
space of the system is defined in classical mechanics in two steps. First,
the particles are notionally given identifying labels ‘1’, ‘2’, ..., ‘N’, with
associated mu-spaces $M^{(i)}$ ($i = 1, 2, \ldots, N$). The space $\Gamma$ is then defined,
again notionally, as the ordered product space
$M^{(1)} \otimes M^{(2)} \otimes \cdots \otimes M^{(N)}$. Given
any point in this space, one can obtain other points by permutations of
the position and momentum values corresponding to the various particles
(notionally distinguished from one another with labels ‘1’, ‘2’, ..., ‘N’),
and all the N! points in $\Gamma$ obtained in this manner correspond to the same
state of motion of the system under consideration. The actual phase space
$\bar{\Gamma}$, whose points are in one-to-one correspondence with the states of the
system, is obtained from $\Gamma$ by identifying the set of N! points in this
manner and invoking this process of identification for all such N!-tuples of
points in $\Gamma$.
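The identification of permuted points can be made concrete with a short sketch. It is only an illustration under stated assumptions: each particle moves in one dimension, a labelled point is stored as a list of per-particle (q, p) pairs, and the helper name `canonical_state` is hypothetical. Sorting the pairs is one convenient way of choosing a single representative for each set of permuted points.

```python
from itertools import permutations


def canonical_state(point):
    """Map a labelled point (a sequence of per-particle (q, p) pairs) to a
    label-independent representative by sorting the pairs."""
    return tuple(sorted(point))


# Two identical particles in one dimension (illustrative numbers).
alpha, beta, xi, eta = 0.3, 1.2, -0.5, 2.0
point_c = [(alpha, xi), (beta, eta)]        # particle '1' carries (alpha, xi)
point_c_prime = [(beta, eta), (alpha, xi)]  # notional labels interchanged
same_state = canonical_state(point_c) == canonical_state(point_c_prime)

# Three particles in distinct single-particle states:
# N! = 6 labelled points collapse to a single state.
pairs = [(0.1, 1.0), (0.2, 2.0), (0.3, 3.0)]
labelled_points = set(permutations(pairs))
states = {canonical_state(pt) for pt in labelled_points}
print(same_state, len(labelled_points), len(states))
```

When two or more particles share the same (q, p) values, fewer than N! distinct labelled points arise; such coincidences form a set of zero volume in the phase space.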


In the following, when we refer to a point with phase space co-ordinates,
say, $(q_1, \ldots, q_s; p_1, \ldots, p_s)$, it will be the notional state space $\Gamma$ that we will
be speaking of. For instance, in the simple special case of a system of two
particles, each moving in one dimension, a point $(q_1, q_2; p_1, p_2)$ will mean
that the position and momentum co-ordinates of the first particle are
$q_1, p_1$, while those of the second particle are $q_2, p_2$ (as indicated above, this
corresponds to the same state of motion of the system of two identical
particles as the point $(q_2, q_1; p_2, p_1)$ of the space $\Gamma$ under consideration).
In accordance with this practice, the symbol $dq_1 \cdots dq_s\, dp_1 \cdots dp_s$ (to be
abbreviated as $d^{(s)}q\, d^{(s)}p$) will denote a volume element in the notional
space $\Gamma$. Considering small ranges of the position and momentum
co-ordinates of the particles (carrying the notional labels ‘1’, ‘2’, ..., ‘N’) from
$q_i$ to $q_i + dq_i$ and from $p_i$ to $p_i + dp_i$ ($i = 1, \ldots, s$) one obtains the volume
element $d^{(s)}q\, d^{(s)}p$ in $\Gamma$ and, given any point in this volume element, we will
have N! corresponding points in as many volume elements in the same
space representing the same state of the system of identical particles
under consideration.


The phase space co-ordinates are made up of canonically conjugate
variables. We will, for this purpose, mostly use the Cartesian position and
momentum co-ordinates of the particles constituting the system of interest,
though more generally other sets of conjugate variables, such as angular
co-ordinates and the corresponding angular momenta, may prove to be useful.


In other words, a volume element $d^{(s)}q\, d^{(s)}p$ in the notional phase space $\Gamma$
corresponds to a volume element measuring $\frac{1}{N!}\, d^{(s)}q\, d^{(s)}p$ in the actual state
space $\bar{\Gamma}$. The replacement $d^{(s)}q\, d^{(s)}p \to \frac{1}{N!}\, d^{(s)}q\, d^{(s)}p$ is necessary to account for
the indistinguishability of the particles of a system when used under an
integral sign, with an integrand that is a symmetric function of the phase
space co-ordinates.
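The origin of the factor can be seen in the simplest case of N = 2 particles in one dimension. The following worked step is a sketch under an assumption not spelled out in the text: the region $q_1 < q_2$ of the notional space is taken as a representative of the actual state space (the set $q_1 = q_2$ has zero volume and is ignored), and f stands for any function that is symmetric under the interchange of the particle labels.

```latex
\begin{align*}
\int_{\Gamma} f \, dq_1\, dq_2\, dp_1\, dp_2
  &= \int_{q_1 < q_2} f \, dq_1\, dq_2\, dp_1\, dp_2
   + \int_{q_1 > q_2} f \, dq_1\, dq_2\, dp_1\, dp_2 \\
  &= 2 \int_{q_1 < q_2} f \, dq_1\, dq_2\, dp_1\, dp_2 .
\end{align*}
```

The second line follows because the interchange $(q_1, p_1) \leftrightarrow (q_2, p_2)$ maps one region onto the other, leaves f unchanged, and preserves the volume element. The integral over the representative region is therefore $\frac{1}{2!}$ times the integral over the whole notional space, and the same argument with N! permuted regions gives the factor $\frac{1}{N!}$ in general.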


1.3.2.3 The potential energy and the Hamiltonian 


The constitution of the system under consideration is determined by
the interaction among its constituents (i.e., the mutual interaction among
the individual particles in the case of the system of N particles) which is
completely described by its potential energy function $\Phi(\{q\})$. The
dynamical evolution of the system is determined by the Hamiltonian function

$$H(\{q\},\{p\}) = \sum_{i=1}^{s} \frac{p_i^2}{2m} + \Phi(\{q\}), \tag{1-7}$$

where m stands for the mass of the particles making up the system
(recall that we have in mind a system made up of N identical particles, with
some appropriate correspondence between the phase space co-ordinates
{q},{p} and the Cartesian components of the position and momentum
vectors of the particles). The value of the Hamiltonian function at any
phase space point ({q}, {p}) gives the energy of the system in the
corresponding pure state.


In order to explain the thermodynamic behavior of systems, one has to
assume that the potential energy function $\Phi$ satisfies certain criteria, which
we state in section 5.2.


As mentioned above, pure states correspond, in this classical
description, to points in the phase space, specified in terms of the phase space
co-ordinates {q}, {p}, though the correspondence is not one-to-one when
referred to the notional phase space $\Gamma$. The time evolution of a pure state
is described by the following set of 2s first order differential
equations, referred to as the Hamiltonian equations of motion

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i} \qquad (i = 1, 2, \ldots, s). \tag{1-8}$$


Given an initial (pure) state of the system, the solution to these equations
determines a trajectory in the phase space (actually, a set of N!
equivalent trajectories in $\Gamma$; it suffices to consider any one of these equivalent
trajectories), which constitutes the geometrical description of the time
evolution of the state.
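A phase space trajectory of this kind can be traced numerically. The sketch below is an illustration only, under assumptions that are not part of the text: a single degree of freedom with the harmonic potential $\Phi(q) = Kq^2/2$, unit mass and spring constant, and the leapfrog (kick-drift-kick) scheme as the integrator. It follows the point (q, p) for about one period and checks that the energy is nearly unchanged along the trajectory.

```python
import math

M, K = 1.0, 1.0  # assumed mass and spring constant (illustrative values)


def hamiltonian(q, p):
    """H = p^2 / (2 M) + K q^2 / 2 for one degree of freedom."""
    return p * p / (2.0 * M) + 0.5 * K * q * q


def leapfrog_step(q, p, dt):
    """One kick-drift-kick step for dq/dt = dH/dp, dp/dt = -dH/dq."""
    p -= 0.5 * dt * K * q   # half kick: dp/dt = -dH/dq = -K q
    q += dt * p / M         # drift:     dq/dt =  dH/dp = p / M
    p -= 0.5 * dt * K * q   # half kick with the updated q
    return q, p


q, p = 1.0, 0.0                      # initial pure state
e0 = hamiltonian(q, p)
dt = 1.0e-3
n_steps = round(2.0 * math.pi * math.sqrt(M / K) / dt)  # about one period
for _ in range(n_steps):
    q, p = leapfrog_step(q, p, dt)
energy_drift = abs(hamiltonian(q, p) - e0)
print(q, p, energy_drift)
```

After one period the point returns close to its starting position in the phase space, so the trajectory is a closed curve of constant energy, as expected for this potential.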


The dynamical variables (or ‘observables’) pertaining to the system are
represented in the classical description by real functions of the phase
space co-ordinates. The Hamiltonian itself is one such dynamical
variable of special significance (since it represents the energy of the system
and determines the time evolution in accordance with the equations (1-8)
stated above).


The relevant functions are commonly defined in terms of the notional phase
space $\Gamma$, in which case these are to satisfy a set of symmetry requirements
because of the fact that the system under consideration is made up of
identical particles.


A dynamical variable A({q}, {p}) (we confine our considerations here to
functions that do not depend explicitly on time) represents a conserved
quantity, i.e., is a constant of motion whose value remains constant along
any given trajectory, if its Poisson bracket with the Hamiltonian is
identically zero, i.e., if the condition

$$\{A, H\} \equiv \sum_{i=1}^{s} \left( \frac{\partial A}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial H}{\partial q_i} \right) = 0 \tag{1-9}$$

is satisfied.
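The Poisson bracket condition can be tested numerically for a given pair of functions. The following is a minimal sketch, not a general-purpose tool: the helper `poisson_bracket` is hypothetical and estimates the partial derivatives by central differences, and the example system (one particle moving in a plane in an assumed central quartic potential) is chosen only because its angular momentum is known to be conserved.

```python
def poisson_bracket(A, H, q, p, eps=1.0e-5):
    """Central-difference estimate of sum_i (dA/dq_i dH/dp_i - dA/dp_i dH/dq_i)."""
    def partial(f, which, i):
        q_plus, q_minus = list(q), list(q)
        p_plus, p_minus = list(p), list(p)
        if which == "q":
            q_plus[i] += eps
            q_minus[i] -= eps
        else:
            p_plus[i] += eps
            p_minus[i] -= eps
        return (f(q_plus, p_plus) - f(q_minus, p_minus)) / (2.0 * eps)

    return sum(
        partial(A, "q", i) * partial(H, "p", i)
        - partial(A, "p", i) * partial(H, "q", i)
        for i in range(len(q))
    )


def hamiltonian(q, p, m=1.0):
    """Kinetic energy plus an assumed central potential r^4 / 4."""
    return (p[0] ** 2 + p[1] ** 2) / (2.0 * m) + 0.25 * (q[0] ** 2 + q[1] ** 2) ** 2


def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]


def x_coordinate(q, p):
    return q[0]


q0, p0 = [0.7, -0.4], [0.3, 1.1]   # an arbitrary phase space point
pb_angular = poisson_bracket(angular_momentum, hamiltonian, q0, p0)
pb_x = poisson_bracket(x_coordinate, hamiltonian, q0, p0)
print(pb_angular, pb_x)
```

The bracket of the angular momentum with the Hamiltonian vanishes to within the accuracy of the finite differences, while the bracket of the x co-ordinate equals $p_x/m$ (0.3 at the chosen point), so the latter is not a constant of motion.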


1.3.2.4 The fundamental volume element in the phase space 


The phase space description of classical mechanics mentioned above
suffers from a fundamental limitation that can be explained only with
reference to the quantum description (refer to section 1.3.3 and also to 3.3
where, in the latter, the quantum mechanical description of systems is
outlined in greater detail in the case of an ideal gas).


Because of the continuous distribution of points in the phase space, the
approach of classical mechanics, adopted uncritically, allows one to talk
of pure states arbitrarily close to one another. However, from the
operational point of view, pure states are defined by means of measurements
on fundamental observables such as the positions and momenta of the
particles of the system under consideration. Though, in practice, every
measurement involves errors of measurement, there is no fundamental
principle in classical mechanics that prevents the errors from being made
arbitrarily small.


However, in the quantum description, which is accepted as being more
fundamental than the classical one, measurements of the basic
observables involve unavoidable uncertainties, which cannot be made arbitrarily
small, even in principle. As a result, the specification, in classical terms,
of pure states arbitrarily close to one another in the phase space can only
have a notional significance. Though this basic fact of the existence of
a fundamental uncertainty in measurements is incorporated consistently
in the quantum mechanical description of systems, it can be taken into
account in the classical description in only an ad hoc manner (refer to
section 4.2.1 for more specific statements).


In order to arrive at the proper classical limit of the quantum mechanical
description, the phase space in the classical description of a macroscopic
system (we will be referring to the notional phase space $\Gamma$, while
adopting the measure of dividing the volume elements in this space by N!
as explained in sec. 1.3.2.2) is assumed to be partitioned into cells or
irreducible volume elements with a measure
$(\delta q_1 \delta p_1)(\delta q_2 \delta p_2) \cdots (\delta q_s \delta p_s) = h^s = h^{3N}$
(this is depicted schematically in fig. 1-4), where h stands for
the Planck constant having the same dimension as that of a product of
the form $\delta q\, \delta p$. The quantum principle of unavoidable uncertainties in the
measurement of basic observables translates into this ad hoc classical
description in the form of the statement that two separate pure states
cannot occupy the same cell in the phase space.
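To get a feeling for the size of a phase cell, one can count the cells available to a single particle. The sketch below is an order-of-magnitude illustration under assumptions of our own: one point particle (s = 3) confined to a box, with momentum magnitude up to $\sqrt{2mE}$; the choice of a helium-4 atom in one litre with E = (3/2)kT at 300 K, the approximate mass value, and the function name are all illustrative.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant in J s (exact SI value)
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)


def phase_cells_single_particle(volume, mass, energy):
    """Number of cells of measure h^3 in the region |p| <= sqrt(2 m E) inside a box.

    For one point particle s = 3, so the phase space volume is
    V * (4/3) * pi * p_max^3 and each cell has measure h^3.
    """
    p_max = math.sqrt(2.0 * mass * energy)
    phase_volume = volume * (4.0 / 3.0) * math.pi * p_max ** 3
    return phase_volume / H_PLANCK ** 3


m_he = 6.6464731e-27                # kg, approximate mass of a helium-4 atom
e_max = 1.5 * K_BOLTZMANN * 300.0   # J, assumed upper limit of kinetic energy
n_cells = phase_cells_single_particle(1.0e-3, m_he, e_max)
print(f"{n_cells:.2e}")
```

The count comes out of the order of $10^{28}$ cells for a single atom, which indicates how fine the partitioning is on the macroscopic scale.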


Such a partitioning of the phase space into fundamental cells becomes
necessary in statistical mechanics when the amount of information involved in
the specification of a thermodynamic state assumes relevance in the form of
the entropy or of some other thermodynamic potential relating to that state.


Thus, in the classical description, when one refers to a pure state
corresponding to a point ({q},{p}) in the phase space (either $\Gamma$ or $\bar{\Gamma}$; as
mentioned above we will generally refer to the space $\Gamma$ with the proviso of
identifying points in it obtained by all possible permutations of the
mu-space co-ordinates of the constituent particles), one actually has in mind
a state associated with a phase space cell centered at that point. Though
the working principles of classical statistical mechanics arrived at in this
manner (refer to chapter 2) do not have a rigorous basis, these are still of
great practical value. Indeed, these describe quantitatively the
thermodynamic behavior of a considerable variety of systems where results of
classical statistical mechanics can be interpreted as ones emerging in a
limiting sense from those of quantum statistical mechanics (refer to
chapter 4, sec. 4.2.1 where the classical approximation to quantum statistical
mechanics is outlined).


Figure 1-4: Depicting schematically the partitioning of the phase space into
fundamental phase cells; a volume element $d\Gamma$ is shown in the notional phase space $\Gamma$ in some
region (R) of the phase space; a few phase cells are represented by means of small cubes;
for a system of N particles, each phase cell has a volume $h^{3N}$; in addition, a factor of
$\frac{1}{N!}$ comes in due to the indistinguishability of the particles making up the system under
consideration; as a result, a volume element $d\Gamma$ in $\Gamma$ is to be replaced with
$\frac{1}{N!\,h^{3N}}\, d\Gamma$ while
estimating a phase space integral with an integrand that is a symmetric function of the
phase space coordinates under an interchange of particle indices.


1.3.3 The quantum mechanical description: pure states 
1.3.3.1 The Hamiltonian and the time evolution 


In the quantum mechanical description, the pure states of a system are
represented as vectors in a linear vector space, more specifically, a
Hilbert space made up of complex square integrable functions of the
position co-ordinates {q}. At times one needs to generalize to Hilbert spaces
incorporating internal degrees of freedom of the particles constituting the
system under consideration; however, we will, for the time being, confine
our considerations to systems whose quantum description has a
‘classical counterpart’ which, in particular, means that internal degrees of
freedom like spin, which do not have a classical interpretation, will not
feature in the description of the system. The observable quantities,
including the Hamiltonian, are then represented as linear Hermitian operators
in this vector space. The special feature of the quantum description,
in contrast to the classical one, is that the various operators do not, in
general, commute among themselves. This imposes great restrictions
on the results of possible measurements of the various observables in
any given state of the system. In particular, among the set of operators
$\{\hat{q}\}, \{\hat{p}\}$, with the dynamical variables {q},{p} as their counterparts, the
pairs $(\hat{q}_i, \hat{p}_i)$ ($i = 1, 2, \ldots, s$) do not commute among themselves, and one
has the fundamental commutators

$$[\hat{q}_i, \hat{p}_i] = \hat{q}_i \hat{p}_i - \hat{p}_i \hat{q}_i = i\hbar, \tag{1-10}$$

where $[\cdot\,, \cdot]$ stands for the commutator of two operators and $\hbar = \frac{h}{2\pi}$, h being
the Planck constant. The ‘hat’ symbol will be used to denote an operator
though it will, at times, be omitted for the sake of brevity, provided the
context allows such omission.
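The fundamental commutator can be verified directly for a single degree of freedom. The check below is a sketch that assumes the usual position representation, in which $\hat{q}$ acts on a wave function $\psi(q)$ by multiplication and $\hat{p}$ by differentiation; this representation is a standard choice and is not spelled out in the text above.

```latex
\begin{align*}
\hat{q}\,\psi(q) &= q\,\psi(q), \qquad
\hat{p}\,\psi(q) = -i\hbar\,\frac{\partial \psi}{\partial q}, \\
(\hat{q}\hat{p} - \hat{p}\hat{q})\,\psi
  &= -i\hbar\, q\,\frac{\partial \psi}{\partial q}
     + i\hbar\,\frac{\partial}{\partial q}\bigl(q\,\psi\bigr)
   = -i\hbar\, q\,\frac{\partial \psi}{\partial q}
     + i\hbar\,\psi + i\hbar\, q\,\frac{\partial \psi}{\partial q}
   = i\hbar\,\psi .
\end{align*}
```

Since this holds for every differentiable wave function, the operator identity $[\hat{q}, \hat{p}] = i\hbar$ follows.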


A primer on elementary notions in quantum mechanics that may help in
going through the present volume (i.e., those parts of it that refer to quantum
mechanical principles) may be found in [85], since a familiarity with these
basic notions will be assumed in this book. In particular, we will make
use of the so-called Dirac notation in this book.


The constitution of the system is determined by its potential energy
operator $\hat{\Phi}$, which will be taken to be the same function of the operators $\{\hat{q}\}$ as
$\Phi$ is of the dynamical variables {q}; this entails no conflict with the
non-commutativity of operators since all the operators belonging to the set $\{\hat{q}\}$
commute among themselves. The Hamiltonian operator, corresponding to
the energy observable of the system, is then represented by

$$\hat{H}(\{\hat{q}\}, \{\hat{p}\}) = \sum_{i=1}^{s} \frac{\hat{p}_i^2}{2m} + \Phi(\{\hat{q}\}) \tag{1-11}$$

(in this context, refer to section 5.2.2.2). The counterpart of a classical
observable represented by the real function A({q}, {p}) will be denoted by
the Hermitian operator $\hat{A}(\{\hat{q}\}, \{\hat{p}\})$ where we will assume that the
function A({q}, {p}) is such that the fact of non-commutativity expressed by
(1-10) does not cause any ambiguity in obtaining the expression for $\hat{A}$
(the Hamiltonian (1-7) is such a function; the possible ambiguity that
may arise in the case of operators of more general description can be
removed by adopting the so-called Weyl representation of operators, briefly
mentioned in section 10.2.4 later in the book). The quantum mechanical
observable $\hat{A}$ will stand for a conserved quantity if it commutes with the
Hamiltonian, i.e., if

$$[\hat{A}, \hat{H}] = \hat{A}\hat{H} - \hat{H}\hat{A} = 0. \tag{1-12}$$


A pure state in the quantum mechanical description, represented by a
vector in the relevant Hilbert space, will be denoted by a symbol like $|\psi\rangle$,
in the so-called Dirac notation. The time evolution of such a state is given
by the Schrödinger equation

$$i\hbar\,\frac{\partial}{\partial t}|\psi\rangle = \hat{H}\,|\psi\rangle. \tag{1-13}$$
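As a brief worked illustration (a standard result, stated here under the assumption that $\hat{H}$ does not depend on time), an eigenstate of the Hamiltonian evolves only by a phase factor, which is why such a state is called stationary. The symbols $|\psi_E\rangle$ and E are labels introduced for this example.

```latex
\hat{H}\,|\psi_E\rangle = E\,|\psi_E\rangle
\;\;\Longrightarrow\;\;
|\psi(t)\rangle = e^{-iEt/\hbar}\,|\psi_E\rangle ,
\qquad
i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle
  = E\,e^{-iEt/\hbar}\,|\psi_E\rangle
  = \hat{H}\,|\psi(t)\rangle .
```

The phase factor drops out of all expectation values, so the results of measurements in a stationary state do not change with time.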


The quantum mechanical description of a system outlined in the above
paragraphs makes primary reference to a Hilbert space whose elements are
the vectors describing the pure states of the system; the observables of the
system are then identified with a class of Hermitian operators in the Hilbert
space. Though we will refer to this mode of description in this book, there
exists an alternative and more general mode of description that makes
primary reference to the algebra of operators representing the observables of
the system, and then introduces the states as mappings from the
observables to the set of real numbers (refer, for instance, to [121]). This approach
actually constitutes the rigorous basis of statistical mechanics where the
‘system’ of interest is defined in only a limiting sense (the thermodynamic
limit), as a result of which one cannot, strictly speaking, adopt a description
that begins with a given Hilbert space.


1.3.3.2 Symmetry properties of pure states 


A prime concern of statistical mechanics is to adopt an appropriate
description of the pure states of composite systems made up of identical
particles, since the mixed states (the objects of ultimate relevance in
statistical mechanics) are described as probability distributions over
sets of pure states for such composite systems. We have seen in sec. 1.3.2.2
how, in the classical theory, this requires a careful definition of the phase
space ($\bar{\Gamma}$) whose points have a one-to-one correspondence with the pure
states of the system.


In this context, the quantum mechanical setting differs from the classical
one, though both are based on analogous considerations. In the classical
description, the individual particles, though identical, can in principle be
distinguished from one another (as in the case of a set of identical billiard
balls) because at any given instant, each is located at a point distinct
from the others; the state of the system, however, remains unchanged
under a permutation of the particles so distinguished. In the quantum
description, on the other hand, the particles are indistinguishable and
cannot be assigned labels as in the classical case.


The indistinguishability of identical particles greatly constrains the
admissible wave functions (i.e., the square integrable functions describing
pure states, abstractly represented by vectors in a Hilbert space, referred
to as state vectors) for the system of particles under consideration,
because the relevant state space for the system cannot be described as a
direct product of Hilbert spaces for individual particles, with any given
ordering of these factor spaces. Starting with the direct product with
any given ordering, the set of admissible states can be described as a
subspace that is either completely symmetric or completely antisymmetric
under all possible re-orderings, i.e., permutations of the labels used to
order the factor spaces.


A simple but instructive illustration of this principle can be given by
referring to a system of two non-interacting particles. We begin by
considering two factor spaces, each for a single particle, vectors belonging to the
two being distinguished by super-indices ‘(1)’ and ‘(2)’. Thus, $|\psi^{(1)}\rangle$ and
$|\psi^{(2)}\rangle$ denote vectors belonging to the two factor spaces, each denoting a
possible state for a single particle. The particles being non-interacting,
it makes sense to speak of a Hamiltonian (say, $\hat{H}_{\rm single}$) for either of the
two particles, acting as an operator in either of the two factor spaces. If
$|a^{(i)}\rangle$ and $|b^{(i)}\rangle$ ($i = 1, 2$) be eigenstates of $\hat{H}_{\rm single}$ in either of the two
factor spaces, corresponding to eigenvalues, say, $E_a$, $E_b$ ($E_a \neq E_b$; note that
the super-index ‘(i)’ is not necessary in the case of the eigenvalues), then
there exists an energy eigenvalue $E_a + E_b$ for the two-particle system, for
which the eigenfunction, representing a pure stationary state, can be
either $\frac{1}{\sqrt{2}}\left(|a^{(1)}\rangle \otimes |b^{(2)}\rangle + |b^{(1)}\rangle \otimes |a^{(2)}\rangle\right)$ (symmetric) or
$\frac{1}{\sqrt{2}}\left(|a^{(1)}\rangle \otimes |b^{(2)}\rangle - |b^{(1)}\rangle \otimes |a^{(2)}\rangle\right)$
(antisymmetric; we will, unless otherwise stated, always consider
normalized state vectors). For the two given single-particle energy eigenvalues
$E_a$ and $E_b$, two other eigenvalues are possible for the two-particle system,
namely, $2E_a$ and $2E_b$, but an antisymmetric energy eigenfunction does not
exist for either of these two; on the other hand, a symmetric
eigenfunction is possible for each of these, namely,
$|a^{(1)}\rangle \otimes |a^{(2)}\rangle$ and $|b^{(1)}\rangle \otimes |b^{(2)}\rangle$
respectively.
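The two-particle construction above can be checked with explicit components. The sketch below makes simplifying assumptions of its own: each factor space is taken to be two-dimensional with real components, a two-particle vector is stored as a matrix of coefficients, and the helper names `tensor`, `swap` and `combine` are hypothetical.

```python
import math


def tensor(u, v):
    """Components of the product vector u (x) v, stored as matrix[i][j] = u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]


def swap(state):
    """Interchange the two factor spaces: component (i, j) goes to (j, i)."""
    n = len(state)
    return [[state[j][i] for j in range(n)] for i in range(n)]


def combine(s1, s2, sign):
    """Normalized s1 + sign * s2, or None if the combination vanishes."""
    raw = [[x + sign * y for x, y in zip(r1, r2)] for r1, r2 in zip(s1, s2)]
    norm = math.sqrt(sum(x * x for row in raw for x in row))
    if norm == 0.0:
        return None
    return [[x / norm for x in row] for row in raw]


a, b = [1.0, 0.0], [0.0, 1.0]     # two orthonormal single-particle eigenstates
ab, ba = tensor(a, b), tensor(b, a)
symmetric = combine(ab, ba, +1)
antisymmetric = combine(ab, ba, -1)

print(swap(symmetric) == symmetric)
print(swap(antisymmetric) == [[-x for x in row] for row in antisymmetric])
print(combine(tensor(a, a), tensor(a, a), -1))
```

The first two lines printed are True, confirming the behavior of the two combinations under interchange; the last line prints None because the antisymmetric combination built from two copies of the same single-particle state vanishes, in line with the absence of an antisymmetric eigenfunction for the eigenvalues $2E_a$ and $2E_b$.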


The symmetry requirements on the state vectors represent an essential
constraint on possible states of a system of indistinguishable particles
regardless of whether they are interacting or not. We have considered a
non-interacting system in the above paragraph so as to indicate how the energy
eigenstates of such a system may depend on these symmetry requirements.


Fundamental particles in nature can be grouped into two basic classes:
the bosons and the fermions. States of composite systems made up of
identical bosonic particles are necessarily of the symmetric type, while
those made up of identical fermions are always antisymmetric.
Composite particles made of identical bosons or identical fermions can also
be similarly classified: an odd number of identical fermions make up a
fermion, while any other composite particle made up of identical bosons
or fermions belongs to the class of bosons. The requirement that pure
states of fermions are to be antisymmetric in nature forms the basis of
Pauli’s exclusion principle (see section 3.3.1).


In describing the mixed states of statistical mechanics as mixtures of
pure stationary states with appropriate probability distributions over the
latter, one has to consider only symmetric states or only antisymmetric
states in the case of systems of bosons and of fermions respectively. This
will always be implied throughout our discourse in this book.


1.3.4 Mixed states: probability distributions 


A major concern in statistical mechanics is to establish a link between the
constitution of a system, described in terms of the Hamiltonian $H$ or $\hat{H}$ as
the case may be (i.e., equivalently, in terms of the potential energy $\Phi$ or $\hat{\Phi}$),
and its thermodynamic description. Such a program is meaningful only
for a system of large size (i.e., one approaching the thermodynamic limit),
since only such a system is amenable to a thermodynamic description.
For a thermodynamic system, it is meaningless to talk in terms of pure
states in the phase space or the Hilbert space, as the case may be, and it
becomes necessary to describe thermodynamic states as mixed ones.


The state of a system is determined operationally by means of a number
of observations that may be described as a set of measurements performed
on it. Such observations are commonly spread over a period of time,
constituting the preparation history of the state. From a fundamental point of
view, every observation involves some error, however small. In the case of a
system with a large number of constituents, an enormously large number of
fine-tuned observations are necessary to identify a pure state. The number
of such observations, and the errors associated with those, make it
impossible to actually pin down the state of the system to a pure one. In contrast,
only a relatively small number of observations of a comparatively coarse type
are necessary for identifying a thermodynamic equilibrium state, implying
that the latter is a mixture of pure states with some appropriate probability
distribution over these. Pure states are more meaningful in the description
of systems made up of only one or a few particles.


As mentioned above, a mixed state is described by means of a probability 
distribution over a set of pure states. This is made more concrete in the 


following sections. 


1.3.5 Mixed states: classical 


In the classical description, a probability distribution corresponds to a
non-negative function that describes the relative probabilities associated
with the various points, described by the phase space co-ordinates
$(\{q\}, \{p\})$. The actual probability density $\rho(\{q\}, \{p\})$ has to
satisfy the normalization condition

$$\int \rho(\{q\}, \{p\})\, d\Gamma = 1. \tag{1-14a}$$

In this expression

$$d\Gamma = \frac{1}{N!\, h^{DN}}\, d^{(DN)}r\, d^{(DN)}p$$

represents the volume measure in the phase space for a system made up
of $N$ indistinguishable particles, for which $\{q\}, \{p\}$ correspond
to $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N; \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$.
Here $d^{(DN)}r\, d^{(DN)}p$ stands for the phase space volume element (in the
notional phase space $\Gamma$) for an infinitesimally small region extending
from $\mathbf{r}_i$ to $\mathbf{r}_i + d\mathbf{r}_i$, $\mathbf{p}_i$ to
$\mathbf{p}_i + d\mathbf{p}_i$ ($i = 1, 2, \ldots, N$), where the factor
$\frac{1}{N!\, h^{DN}}$ is included to take into account the
indistinguishability of the particles and the partitioning of the phase space
into phase cells, as explained above in sections 1.3.2.2 and 1.3.2.4. The role
of this factor in arriving at meaningful results in classical statistical
mechanics will be apparent in sections 2.2 and 3.2, and also in subsequent
sections based on classical considerations.


[Notation:] For particles moving in three dimensions, the ‘dimensionality’ of
space is $D = 3$. The phase space co-ordinates $\{\{q\},\{p\}\}$ and
$\{\{r\},\{p\}\} = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N; \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$
will be used interchangeably. Later in this book the position co-ordinates
will be collectively denoted by $Q$ and the momentum co-ordinates by $P$. The
Euclidean volume element

$$d^{(D)}r_1\, d^{(D)}r_2 \cdots d^{(D)}r_N\; d^{(D)}p_1\, d^{(D)}p_2 \cdots d^{(D)}p_N$$

is abbreviated as $d^{(DN)}r\, d^{(DN)}p$.


The physical interpretation of the normalized distribution function
$\rho(\{q\}, \{p\})$ is that $\rho(\{r\}, \{p\})\, d\Gamma$ represents the
probability associated with pure states within the volume element
$d^{(DN)}r\, d^{(DN)}p$ of the notional phase space.


In a sense, classical statistical mechanics is based on a patched-up foun- 


dation where quantum theory leaves an indelible imprint. It makes use of a 




phase space and deals with a continuous distribution of points but, at the
same time, has in place a book-keeping system that associates pure states
with phase space cells of volume $h^{DN}$, and takes into account the
indistinguishability of particles by identifying points resulting from a
permutation of particle indices. As I have already mentioned, this is not as
bad as it looks: results of the classical theory emerge from those based on
quantum considerations in a limiting sense, with the Planck constant going to
zero (see sections 3.3.4, 4.2.1).


We will refer to $\rho(\{q\}, \{p\})$ variously as a probability density, a
probability distribution or a distribution function. Though defined over the
entire phase space it may, in special instances, have non-zero values only in
some particular subset of points belonging to the phase space, having a
value zero at points lying outside the subset.


It may be mentioned that a pure state can be looked upon as a special
instance of a mixed state where the distribution function is concentrated on
just one single point, say, $(\{q_0\}, \{p_0\})$, in the phase space (more
precisely, at a set of $N!$ points in $\Gamma$). The probability density is
then represented by a delta function proportional to
$\prod_{i=1}^{s} \delta(q_i - q_{i0})\, \delta(p_i - p_{i0})$ (which,
strictly speaking, makes sense only under an integral sign; recall that,
for a system of $N$ particles, the number of degrees of freedom is $s = 3N$).
In other words, the general description of states of a system, whether of the
pure or the mixed variety, is achieved by means of distribution functions.


While a pure state corresponding to a point $(\{q\}, \{p\})$ in the phase
space is one in which any given observable $A(\{q\}, \{p\})$ has a well
defined value (in particular, the value of the Hamiltonian gives the energy of
the system in the pure state under consideration), a mixed state corresponds
to a distribution of possible values of the observable, determined by the
distribution function $\rho$. Given the function $\rho$ and an observable
$A(\{q\}, \{p\})$, one can work out the mean and the variance (or any higher
moment) of possible values obtained in measurements of the observable as
indicated in sec. 2.2 (formulas (2-55a), (2-55b)).


1.3.6 Mixed states in quantum mechanics: the density matrix


In an analogous manner, a mixed state in the quantum description is
specified in terms of a probability distribution over some set of pure
states, say, $|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle$, the
probabilities (or weights) of these states in the mixture being, say,
$w_1, w_2, \ldots, w_n$, where the $w$’s are non-negative numbers satisfying

$$\sum_{i=1}^{n} w_i = 1 \qquad \text{(normalization)}. \tag{1-15}$$


1. From a formal point of view, pure states in the quantum description can 
be looked upon as discretely distributed entities since what is usually 
needed in any given context is not the set of all states, which have a 
continuous distribution in the Hilbert space, but a set of basic states, 
which is a discrete one for a bounded system. A basic set of exceptional 
relevance is the one made up of the eigenvectors of the Hamiltonian 


operator. 


2. An eigenvector of an operator $\hat{A}$ is a non-zero vector $|a\rangle$
such that the equation

$$\hat{A}|a\rangle = a|a\rangle \tag{1-16}$$

is satisfied, where $a$ is a scalar (in general a complex number; however,
for a Hermitian operator $\hat{A}$, $a$ is necessarily a real number),
referred to as the eigenvalue of $\hat{A}$ corresponding to the eigenvector
$|a\rangle$. A pure state represented by an eigenvector is referred to as an
eigenstate.


3. The eigenvectors of the Hamiltonian operator represent pure stationary 
states in that, when the system under consideration is isolated from 
the rest of the universe, such a state remains unchanged, picking up 
only an inessential time dependent phase factor in its time evolution 
determined by (1-13). This fact assumes relevance in the determination 
of mixed states corresponding to equilibrium states of a thermodynamic 


system. 


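However, the description of a mixed state in terms of a probability
distribution (in the form of a set of probabilities $w_1, w_2, \ldots, w_n$
over pure states $|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle$) is
not unique, though uniqueness is ensured if one requires that the set
$|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle$ be an orthonormal one
(provided that the non-zero weights are all distinct; states sharing a common
weight can still be replaced by any other orthonormal set spanning the same
subspace). Incidentally, as in the classical case, a pure state
$|\psi\rangle$ can be looked upon as a special instance of a mixed state where
the set of probabilities has only one non-zero element, having a value unity.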


The classical description of mixed states also involves a non-uniqueness.
For instance, one could choose the phase space co-ordinates $\{q\}, \{p\}$
with reference to a different set of Cartesian axes for each of the particles
constituting the system under consideration. The non-uniqueness in the
quantum context is, however, symptomatic of a distinction at a deeper level
between the classical and quantum descriptions.


A more general and convenient representation of a mixed state in the 
quantum description is in terms of what is referred to as a density oper- 
ator (more commonly termed a density matrix; operators in the quantum 
formalism are represented by matrices with reference to any chosen set 


of basis vectors). 


A density matrix (technically, a density operator) $\hat{\rho}$ is a Hermitian
operator with non-negative eigenvalues, having a trace (i.e., the sum of the
eigenvalues) unity. Considering a representation in which the basis vectors
are the eigenvectors of $\hat{\rho}$, one has a matrix representation of the
latter where the diagonal elements are its eigenvalues. The operator
$\hat{\rho}$ can then be looked upon as a mixture of pure states represented
by the eigenvectors, with the corresponding eigenvalues as the respective
probabilities. However, as mentioned above, there exist other possible
descriptions of the same mixed state as probability distributions over
non-orthonormal sets of pure states.
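
The non-uniqueness just mentioned can be checked on a small example. The following is only a sketch for a two-level system with real state vectors; the helper names `outer` and `mix`, and the particular numbers, are illustrative assumptions and not part of the formalism developed in this book. It builds one and the same density matrix first from its orthonormal eigenvectors (weights 3/4 and 1/4) and then from two non-orthogonal states with equal weights.

```python
import math

def outer(v):
    """Return the projector |v><v| for a real state vector v."""
    return [[a * b for b in v] for a in v]

def mix(weights, states):
    """Density matrix sum_k w_k |psi_k><psi_k| for real state vectors."""
    dim = len(states[0])
    rho = [[0.0] * dim for _ in range(dim)]
    for w, psi in zip(weights, states):
        proj = outer(psi)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += w * proj[i][j]
    return rho

# Orthonormal description: the eigenvectors, with eigenvalues 3/4 and 1/4.
rho_eigen = mix([0.75, 0.25], [[1.0, 0.0], [0.0, 1.0]])

# Alternative description: two normalized but non-orthogonal states,
# each occurring with weight 1/2.
a = [math.sqrt(0.75), 0.5]
b = [math.sqrt(0.75), -0.5]
rho_other = mix([0.5, 0.5], [a, b])

# Overlap <a|b>; a non-zero value means the two states are not orthogonal.
overlap = sum(x * y for x, y in zip(a, b))

# True when the two descriptions give the same density matrix.
same = all(
    abs(rho_eigen[i][j] - rho_other[i][j]) < 1e-12
    for i in range(2)
    for j in range(2)
)
```

Both mixtures lead to the same matrix, so no measurement on the system can tell the two preparations apart; only the density matrix itself carries physical meaning.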


A pure state represented by the vector $|\psi\rangle$, being a special
instance of a mixed state, can alternatively be described by means of a
density matrix $\hat{\rho} = |\psi\rangle\langle\psi|$. A density matrix
$\hat{\rho}$ represents a pure state if $\hat{\rho}^2 = \hat{\rho}$.
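
The purity criterion can be tried out numerically. The sketch below assumes a two-level system with real matrix elements; the function names (`matmul`, `trace`, `is_pure`) and the tolerance are illustrative choices of mine. It compares the square of a density matrix with the matrix itself, and also evaluates the trace of the square, which equals unity for a pure state and is smaller than unity otherwise.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def is_pure(rho, tol=1e-12):
    """True when rho squared equals rho, element by element, within tol."""
    rho2 = matmul(rho, rho)
    n = len(rho)
    return all(abs(rho2[i][j] - rho[i][j]) <= tol
               for i in range(n) for j in range(n))

# |psi><psi| for |psi> = (|0> + |1>)/sqrt(2): a pure state.
pure = [[0.5, 0.5], [0.5, 0.5]]

# Equal-weight mixture of |0> and |1>: a genuinely mixed state.
mixed = [[0.5, 0.0], [0.0, 0.5]]

purity_pure = trace(matmul(pure, pure))
purity_mixed = trace(matmul(mixed, mixed))
```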


Analogous to the classical case, a mixed state, in general, corresponds
to a distribution of possible values of an observable as revealed by a
large number of measurements of the latter in this state. The mean and
variance of all these possible values of the observable $A$ in a mixed state
$\hat{\rho}$ are given by

$$\langle A \rangle = \mathrm{Tr}(\hat{\rho}\hat{A}), \tag{1-17}$$

$$(\Delta A)^2 = \langle (\hat{A} - \langle A \rangle)^2 \rangle = \mathrm{Tr}(\hat{\rho}\hat{A}^2) - \left(\mathrm{Tr}(\hat{\rho}\hat{A})\right)^2, \tag{1-18}$$

where the symbol $A$ (written without a ‘hat’ over it) stands for the
observable represented by $\hat{A}$. This distinction will, however, not be
observed in the interest of brevity, and we will instead speak of ‘the
observable $\hat{A}$’, in a similar vein as ‘the state $|\psi\rangle$’
(instead of ‘the state represented by $|\psi\rangle$’). In the above
equations the symbol $\mathrm{Tr}(\cdot)$ stands for the trace of an operator,
i.e., the sum of the diagonal elements of the matrix representing the
operator in terms of any chosen orthonormal basis.
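
As an illustration of formulas (1-17) and (1-18), here is a minimal sketch for a two-level system. The density matrix, the observable (taken diagonal, with values +1 and -1) and the helper names are assumptions chosen only to keep the arithmetic transparent.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def mean(rho, A):
    """Mean value Tr(rho A), as in eq. (1-17)."""
    return trace(matmul(rho, A))

def variance(rho, A):
    """Variance Tr(rho A^2) - (Tr(rho A))^2, as in eq. (1-18)."""
    return trace(matmul(rho, matmul(A, A))) - mean(rho, A) ** 2

# Mixture with weights 3/4 and 1/4 over the two basis states.
rho = [[0.75, 0.0], [0.0, 0.25]]

# Observable taking the value +1 in the first state and -1 in the second.
A = [[1.0, 0.0], [0.0, -1.0]]

mean_A = mean(rho, A)      # 0.75 * (+1) + 0.25 * (-1)
var_A = variance(rho, A)   # 1 - mean_A ** 2
```

Since the trace does not depend on the orthonormal basis used, the same two numbers would be obtained from the matrices written in any other such basis.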


Equilibrium states in the thermodynamic description of a system corre- 
spond to mixed states in statistical mechanics where, in the quantum 
formulation, the density matrices are diagonal in the energy representa- 
tion, i.e., a representation in terms of the eigenvectors of the Hamiltonian 


operator. 


Generally speaking, a mixed state represented by a density matrix
$\hat{\rho}$ evolves with time, in accordance with the so-called
Neumann-Liouville equation stated in section 8.4.2.2 (eq. (8-103)). For a
density matrix $\hat{\rho}$ that is diagonal in the energy representation, the
rate of change $\frac{d\hat{\rho}}{dt}$ reduces to zero, implying that
$\hat{\rho}$ describes a mixed stationary state.
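
The stationarity of a diagonal density matrix can be seen in a small numerical check. The sketch assumes the standard form of the equation of motion, in which the rate of change of the density matrix is proportional to its commutator with the Hamiltonian (the statement in this book is eq. (8-103), not reproduced here); the two-level Hamiltonian and the helper names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Hamiltonian written in its own eigenbasis (the energy representation),
# with two distinct energy eigenvalues in arbitrary units.
H = [[1.0, 0.0], [0.0, 2.0]]

# Density matrix diagonal in the energy representation.
rho_diagonal = [[0.75, 0.0], [0.0, 0.25]]

# Density matrix with off-diagonal elements in the energy representation.
rho_coherent = [[0.5, 0.5], [0.5, 0.5]]

comm_diagonal = commutator(H, rho_diagonal)   # vanishes: stationary
comm_coherent = commutator(H, rho_coherent)   # non-zero: evolves in time
```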


I close this section by repeating (refer to 1.3.3.2) that the mixed states
of a system of indistinguishable bosons are to be made up of symmetric
pure states alone, while those for a system of indistinguishable fermions
are to be made up of antisymmetric pure states alone. Further details
on the symmetry requirement in the quantum theory are to be found in
section 3.3.1.


1.4 Mixed states as ensembles: the approach of statistical mechanics


In thermodynamics, equilibrium states of a macroscopic system are described
in terms of only a few relevant thermodynamic variables in spite of the large
number of degrees of freedom of the system, i.e., the thermodynamic states
are, in a large measure, incompletely specified ones. This, in turn, means
that the latter are mixed states of the system described by means of
probability distributions over sets of pure states, where the relevant
probability distributions involve only a few thermodynamic variables as
parameters. The fact that an equilibrium state is invariant in time greatly
constrains the mixture of pure states that it can correspond to. In
statistical mechanics, such mixed states are commonly referred to as
ensembles (equilibrium or stationary ensembles in the context of equilibrium
thermodynamics; non-stationary ensembles are relevant in describing
non-equilibrium states, as outlined in chapter 8).


Referring to the quantum mechanical description for the sake of
concreteness, and considering a mixture of pure states
$|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle$ with respective
weights $w_1, w_2, \ldots, w_n$, an alternative way to describe the state of
the system under consideration would be to imagine a large number of copies
of the system, where each copy is in one of the pure states belonging to the
set $\{|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_n\rangle\}$, the relative
frequency (i.e., the probability) of any chosen state, say $|\psi_i\rangle$
($i = 1, 2, \ldots, n$), occurring among the set of copies being $w_i$. Such
an imagined assembly of fictitious copies of the system is then termed an
ensemble, in general a non-stationary one.
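
The picture of an ensemble as a large collection of copies can be mimicked on a computer. In the sketch below, which is only illustrative, each copy is represented by the index of the pure state it is in; the weights, the number of copies and the function name `build_ensemble` are assumptions made for the example. The relative frequencies found among the copies approach the weights as the number of copies grows.

```python
import random
from collections import Counter

def build_ensemble(weights, copies, seed=0):
    """Assign to each copy the index i of a pure state, drawn with probability w_i."""
    rng = random.Random(seed)           # fixed seed, so the run is reproducible
    labels = list(range(len(weights)))
    return rng.choices(labels, weights=weights, k=copies)

weights = [0.5, 0.3, 0.2]               # w_1, w_2, w_3 (they add up to unity)
ensemble = build_ensemble(weights, copies=100_000)

# Relative frequency with which each pure state occurs among the copies.
counts = Counter(ensemble)
frequencies = [counts[i] / len(ensemble) for i in range(len(weights))]
```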


If each of the pure states in the set mentioned above be a stationary state
of the system under consideration, i.e., an eigenstate of its Hamiltonian
operator, then the state of each of the imagined copies in the ensemble
remains invariant in time. If, moreover, the weights $w_i$
($i = 1, 2, \ldots, n$) be invariant, then the ensemble as a whole can be
described as being a time-invariant (i.e., a stationary) one, in which case it
can represent an equilibrium thermodynamic state of the system (refer to
sec. 2.1). Such a mixed state, being a mixture of energy eigenstates with
time-invariant weights, corresponds to a density matrix that is diagonal in
the energy representation (diagonal density matrices in the energy
representation may correspond to stationary states more general than the
equilibrium ones; see chapter 10 for the relevance of diagonal ensembles).


Classical ensembles are defined and interpreted in an analogous manner,
where stationary ensembles correspond to time-independent distribution
functions. Classical distribution functions describing mixed states of a
system evolve in time in accordance with the so-called Liouville equation
(refer to sections 8.3.3 and 8.4.2.1), which determines whether a
distribution function $\rho(\{q\}, \{p\})$ has a non-trivial time-evolution or
is a stationary one. The condition of stationarity is that the Poisson
bracket of $\rho$ and the Hamiltonian $H(\{q\}, \{p\})$ is to be zero. This
condition is satisfied, in particular, by the equilibrium ensembles of
classical statistical mechanics (sec. 2.2).


In brief, the terms ‘ensemble’ and ‘mixed state’ refer to the same thing, 
where the former constitutes, in a manner of speaking, a visualization of 
the latter. If there is any difference between the two, it lies in their usage: 
while a mixed state can refer to any system of arbitrary choice, the term 
ensemble is more commonly used in the description of thermodynamic 


states. 


Statistical mechanics has the job of describing time-invariant and
time-dependent states of systems by means of appropriate ensembles
(respectively, stationary and non-stationary ones). We will focus, in
chapter 2, on stationary ensembles, mostly in the context of thermodynamic
equilibrium states of macroscopic systems, though ‘small’ systems (i.e., ones
away from the thermodynamic limit; refer to sections 2.1.5.2 and 2.1.6.2)
will also be of interest.


A stationary configuration of the system under consideration is specified
operationally in terms of a set of constraints that tell us how to specify
the state by means of a relatively small number of parameters. One
then describes the state in terms of an ensemble corresponding to the
constraints (described operationally) or, in mathematical terms, by an
appropriate probability distribution over a set of pure states, in the
form of a distribution function (classical) or a density matrix (quantum).


Just as the thermodynamic states of a macroscopic system are amenable 
to several mutually equivalent descriptions in terms of sets of thermody- 
namic variables related to one another by means of Legendre transfor- 
mations (refer back to sections 1.2.5, 1.2.6), one can describe station- 
ary mixed states of a system by means of ensembles of various defini- 
tions, where the descriptions by means of these various ensembles be- 
come equivalent in the thermodynamic limit. For ‘small’ systems, on the 


other hand, these different descriptions are not equivalent. 


As mentioned above, the various ensembles correspond to different op- 
erational specifications of the states of the systems under consideration. 
We will, for the sake of concreteness, mostly confine our attention to a 
system of identical particles confined within a given volume V, each mov- 
ing in three dimensions, which is commonly referred to as a simple fluid. 
Depending on the types of constraint in terms of which the states of the 
fluid are operationally specified, one can describe these states by means 
of the so-called microcanonical, canonical, or grand canonical ensembles. 
Though ensembles of other descriptions can also be made use of, we will 
confine our attention to the three referred to above. We will define these 
three standard ensembles of statistical mechanics in the next chapter, 


first in the quantum context and then in the classical one. 


After a few explanatory notes on these three ensembles, we will see in 
chapter 3 how these lead to the thermodynamic description in the simple 


instance of an ideal gas, i.e., a simple fluid confined in a given volume in 




which there is no mutual interaction among the constituent particles (i.e.,
with the interaction potential $\Phi = 0$). A few systems of practical
interest will also be referred to where the ideal gas description can be
fruitfully made use of.


The ideal gas example is a simple and instructive one, which tells us 
how the approach of statistical mechanics can be effective in describing 
the thermodynamic behavior of a system. However, its very simplicity, 
based on the idealization built into its definition (namely, the absence of 
mutual interactions among its constituent particles), makes it an atypi- 
cal system, in contrast to which the real interest in statistical mechanics 
lies in systems made up of interacting particles. In particular, an ideal 
gas differs from a simple fluid with even very weak interactions among 
the constituent particles, in a number of fundamental respects, though 
the two systems (i.e., the ideal gas and the fluid with weak interactions) 
do not differ much in their thermodynamic behavior over a wide range 
of physical conditions. In other words, in spite of being atypical from a 
fundamental point of view, the ideal gas is of great relevance in thermo- 


dynamics and statistical mechanics. 


The fact that a macroscopic system is made up of a large number of
constituents will be seen to be of essential relevance when one asks why the
stationary ensembles of statistical mechanics are of relevance in describing
the equilibrium thermodynamic states of systems. This, however, takes us
beyond the confines of equilibrium statistical mechanics, since one then has
to consider non-equilibrium states to see whether these have a common
tendency of evolution towards equilibrium states. Questions relating to this
and other relevant issues will be taken up in chapters 9 and 10 later in this
book.


One final observation of quite considerable relevance is the following:
while the descriptions in quantum statistical mechanics of systems of
bosons and of those of fermions differ fundamentally from one another,
the basic thermodynamic formulas apply equally to the two types of systems
because, at a fundamental level, the thermodynamic description is
indifferent toward the nature of the constituent particles of a macroscopic
system (for instance, the formula $F = U - TS$, in familiar thermodynamic
notation, holds for a simple fluid made up of a given number of particles,
regardless of whether the particles are bosons or fermions). The
consequences of the fundamental thermodynamic formulas, however, work
out differently for the two types of systems. For instance, the equation of
state ($P$ against $V$, again in a familiar notation) of a dilute gas of
fermions differs from that of a dilute gas of bosons, because the equation of
state of a system depends on factors relating to its constitution.




Chapter 2 


The Equilibrium Ensembles of Statistical Mechanics


2.1 Introducing the equilibrium ensembles of quantum statistical mechanics


2.1.1 The microcanonical ensemble 


In the microcanonical ensemble, each imagined copy of the system under
consideration is assumed to be in a stationary state with energy ($E$) lying
in some small range of values from $U$ to $U + \delta U$ (with
$\delta U \ll U$ and otherwise unspecified for the time being), where all
such stationary states occur in the ensemble with uniform probability. Thus,
if $W$ be the total number of independent stationary states of the system,
referred to as microstates, within the above energy window, then the relative
weight of each of these is given by

$$w = \frac{1}{W} \qquad (U < E < U + \delta U;\ \delta U \ll U). \tag{2-1}$$

At this stage, the above definition is purely formal, without any physical
relevance ascribed to it. We assume, however, that $U$ and $\delta U$ are
chosen in such a manner that there exists at least one energy eigenvalue $E$
within the range indicated (i.e., $W \neq 0$).
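
The definition (2-1) translates directly into a short computation. The sketch below uses a made-up list of non-degenerate energy eigenvalues (in arbitrary units) and a hypothetical helper `microcanonical_weights`; neither is taken from any particular physical system. States inside the window get the weight 1/W each, and all other states get zero weight.

```python
def microcanonical_weights(energies, U, dU):
    """Return (W, weights) for the window U < E < U + dU, as in eq. (2-1)."""
    in_window = [U < E < U + dU for E in energies]
    W = sum(in_window)                  # number of microstates in the window
    if W == 0:
        raise ValueError("no energy eigenvalue lies in the chosen window")
    weights = [1.0 / W if inside else 0.0 for inside in in_window]
    return W, weights

# Toy spectrum: seven non-degenerate levels, three of them inside the window.
energies = [0.0, 1.0, 1.9, 2.05, 2.1, 2.15, 3.0]
W, weights = microcanonical_weights(energies, U=2.0, dU=0.2)
```

The weights so obtained add up to unity, in keeping with the normalization (1-15).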


The microcanonical ensemble introduced above may also be looked upon 
as a mixture of all the independent stationary states of the system, where 
the weights of the stationary states with energy lying outside the range 


indicated above are all zero. 


The central postulate of statistical mechanics is that, in the
thermodynamic limit (refer to chapter 5), the microcanonical ensemble defined
above represents the state of thermodynamic equilibrium of the system
with internal energy $U$, volume $V$, and number of particles $N$,
operationally specified as one made up of $N$ particles contained
within a volume $V$, and isolated from the rest of the universe, with zero
probability for all its stationary states with energies lying outside the
range $U$ to $U + \delta U$.


In the thermodynamic description, N corresponds to the mole number times 


the Avogadro number. 


In the thermodynamic limit (or, from a physical point of view, close to
it) the formal definition of (2-1) is to be supplemented by the following
additional postulate:

The thermodynamic entropy of the system (a pure fluid in the present
context) in an equilibrium state with specified values of internal energy,
volume, and particle number ($U$, $V$, $N$) is given by

$$S = k_B \ln W, \tag{2-2}$$

where $k_B$ stands for the Boltzmann constant. The quantity $W$ may be
referred to as the microcanonical partition function in analogy with the
canonical and the grand canonical partition functions (to be introduced
later in this chapter) that appear as the inverse of normalization
constants in the expressions for the respective density matrices (or,
equivalently, for the probabilities of the respective states). Eq. (2-2) is
referred to as the Boltzmann formula for the entropy of an isolated system.


Though we have, for the sake of simplicity, referred to a pure fluid as our
system of interest, the definition (2-1) extends to a composite system made
up of several species of particles. Using the superscript $\alpha$ to
distinguish the various different species (referred to as components) in the
system of particles under consideration, a state of thermodynamic
equilibrium is characterized by given values of $U$, $V$, and
$\{N^{(\alpha)}\}$ ($\alpha = 1, 2, \ldots$), where the thermodynamic limit
corresponds to $V \to \infty$, $N^{(\alpha)} \to \infty$ (which implies
$U \to \infty$, because of the nature of the potential $\Phi$ we will specify
in section 5.2 later in this book). It is also possible that the system is
made up of more than one phase of each component. In most of what follows,
however, we will keep our considerations as simple as possible, and assume
that our system includes only one component, since the generalization of our
results to more than one component does not involve new principles. The
coexistence of more than one phase (see sec. 1.2.6 in chapter 1) will also
not be considered for now (the statistical mechanics of first order phase
transitions will be taken up in chapters 5 and 6; see, in particular,
section 5.7). Finally, the system under consideration may be subjected to an
external influence such as an electric or a magnetic field. These will also
be left out of consideration for now.


Recall that what we are now engaged in is setting up a correspondence 
between equilibrium thermodynamic states of a system and the equilibrium 
ensembles of statistical mechanics, of which the microcanonical ensemble 
constitutes a particular instance. This leaves open the entire question of 
why and how the system, subjected to given constraints, attains the state 
of equilibrium, starting from any chosen initial state. This, of course, is 
a problem of central relevance and vast complexity, which we will address
only to a very limited extent in later chapters of this book. The process of
attaining equilibrium may, for instance, involve phase transitions, chemical 
reactions, thermal conduction, and viscous dissipation. The equilibrium 
ensembles of statistical mechanics refer only to the final thermodynamic 


states, attained after the cessation of all such processes. 


Let the energy eigenvalues of the system be labeled
$E_0, E_1, \ldots, E_i, \ldots$, where the index $i$ increases from the
lowermost energy upward ($E_0$ corresponds to the ground state of the
system). With this labeling of the energy eigenvalues, let the energies lying
within the specified range ($U$ to $U + \delta U$) be
$E_{J+1}, \ldots, E_{J+W}$ for some appropriate $J$, where $E_{J+1}$ denotes
the lowest energy within the specified range.

Recall that the number of independent states lying within the above energy
range has been assumed to be $W$; the set of stationary states can always be
chosen to be an orthonormal one. Here we assume that the energy values are
non-degenerate; this is presumably the general situation for large systems.


The density operator (hereafter we will use the term ‘density matrix’ more
frequently, even when not referring to any specific representation in terms
of some specified set of basis vectors) of the mixed state described by the
microcanonical ensemble is then given by

$$\hat{\rho}_m = \frac{1}{W} \sum_{i=J+1}^{J+W} |E_i\rangle\langle E_i|. \tag{2-3}$$


In the energy representation (i.e., one for which the basis vectors are the
eigenvectors corresponding to the stationary states), the density operator
is represented by a matrix that has a single diagonal block, a $W \times W$
matrix of which all the diagonal elements are of value $\frac{1}{W}$, all the
remaining elements of the density matrix being zero. In other words, there
may exist energy eigenstates (generally speaking, an infinity of those) other
than the ones featuring in (2-3), with energies outside the range $U$ to
$U + \delta U$. The elements of the density matrix in the energy
representation, defined by

$$\rho_{m\,ij} = \langle E_i|\hat{\rho}_m|E_j\rangle \qquad (i, j = 1, 2, \ldots), \tag{2-4a}$$

are seen to be

$$\rho_{m\,ij} = \begin{cases} \dfrac{1}{W}\,\delta_{ij} & (\text{for } i = J+1, J+2, \ldots, J+W), \\[4pt] 0 & (\text{for all other } i, j). \end{cases} \tag{2-4b}$$


As mentioned earlier, we will use the so-called Dirac notation in this book 
for the sake of convenience. A brief primer on quantum mechanical notions 


and notation may be found in [85]. 


For any observable $\hat{A}$ pertaining to the system, its average value in
the (mixed) state represented by the microcanonical ensemble is given by

$$\langle A \rangle = \mathrm{Tr}(\hat{\rho}_m \hat{A}) = \frac{1}{W} \sum_{i=J+1}^{J+W} A_{ii}, \tag{2-5}$$

where $A_{ii}$ stands for the diagonal elements of the matrix for the operator
$\hat{A}$ in the energy representation.
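
Formula (2-5) says that only the diagonal elements of the observable within the energy window matter, each entering with the same weight. A minimal sketch follows, with made-up diagonal elements and a hypothetical helper `microcanonical_average`; the list index is taken to coincide with the label of the energy eigenvalue, starting from zero for the ground state.

```python
def microcanonical_average(diag_A, J, W):
    """Average (1/W) * sum of A_ii over i = J+1, ..., J+W, as in eq. (2-5)."""
    return sum(diag_A[i] for i in range(J + 1, J + W + 1)) / W

# Diagonal elements A_00, A_11, ..., A_55 in the energy representation
# (illustrative numbers only).
diag_A = [5.0, 1.0, 2.0, 4.0, 6.0, 9.0]

# Window containing the levels i = 2, 3, 4 (so that J = 1 and W = 3).
average_A = microcanonical_average(diag_A, J=1, W=3)   # (2 + 4 + 6) / 3
```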


Finally, going over to the thermodynamic limit, the entropy of the
equilibrium state represented by the microcanonical ensemble (refer to
eq. (2-2)) can be expressed in the form

$$S = -k_B \mathrm{Tr}(\hat{\rho}_m \ln \hat{\rho}_m), \tag{2-6}$$

where $\ln \hat{\rho}_m$ denotes the logarithm of $\hat{\rho}_m$ (check this
out). It may be mentioned that, for any given density matrix $\hat{\rho}$,
$\hat{\rho} \ln \hat{\rho}$ is defined even when some of the eigenvalues of
$\hat{\rho}$ are zero (recall that $\hat{\rho}$ does not have negative
eigenvalues, and some eigenvalues have to be positive, since it is of unit
trace).


Incidentally, given any mixed state represented by the density matrix
$\hat{\rho}$, the expression

$$S_q = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho}) \tag{2-7}$$

is said to define its quantum mechanical entropy (or von Neumann
entropy), of which the entropy of the microcanonical ensemble is a
particular instance (the factor $k_B$ in (2-6) enters on physical grounds, so
as to achieve agreement with the thermodynamic entropy; at times, it is
included in the definition of the von Neumann entropy). For a pure state,
which is a special instance of a mixed state with a density matrix of the
form $\hat{\rho} = |\psi\rangle\langle\psi|$ for some vector $|\psi\rangle$,
the von Neumann entropy works out to $S_q = 0$.


Referring to a representation in which a given density matrix $\hat{\rho}$ is
diagonal (in the case of an equilibrium ensemble, this coincides with the
energy representation) with diagonal elements $p_i$ ($i = 1, 2, \ldots$), the
von Neumann entropy is given by the expression

$$S_q = -\sum_i p_i \ln p_i, \tag{2-8}$$

while the thermodynamic entropy $S$ is obtained (in the case where
$\hat{\rho}$ represents a thermodynamic state) by multiplying this with $k_B$.
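
The expression (2-8) is easy to evaluate once the diagonal elements are known. The sketch below (function name and numbers are illustrative assumptions) treats a term with zero probability as contributing nothing, and checks the two special cases mentioned in the text: a microcanonical distribution with W equally weighted states gives ln W, in agreement with (2-2) apart from the factor of the Boltzmann constant, while a pure state gives zero.

```python
import math

def von_neumann_entropy(p):
    """Return -sum_i p_i ln p_i, with terms having p_i = 0 contributing zero."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

# Microcanonical case: W states of weight 1/W, all other states of weight zero.
W = 8
p_micro = [1.0 / W] * W + [0.0] * 4
S_micro = von_neumann_entropy(p_micro)   # expected to equal ln W

# Pure state: a single state with probability unity.
S_pure = von_neumann_entropy([1.0, 0.0, 0.0])
```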


In the case of a thermodynamic system (i.e., one conforming to the
thermodynamic limit or, informally speaking, one close to this limit) we will
see (section 5.3.4) that the microcanonical ensemble represents, as
mentioned above, an equilibrium thermodynamic state with given values of
$U$, $V$, $N$ (or, more precisely, with given
$u = \lim_{N \to \infty} \frac{U}{N}$, $v = \lim_{N \to \infty} \frac{V}{N}$)
and, moreover, the states defined in this limit are stable in the
thermodynamic sense. One expresses this by saying that the thermodynamic
limit of the microcanonical ensemble exists. If, now, one considers a family
of such ensembles with values of $U$, $V$, $N$ varying continuously, and
makes use of the entropy expression (2-6), then it can be seen that all the
thermodynamic functions can be defined for the system, obeying the
fundamental thermodynamic relations; in other words, the family of
microcanonical ensembles so considered represents the thermodynamic behavior
of the system.


In particular, the temperature can be defined by referring to such a family
by means of the formula

$$T^{-1} = \left(\frac{\partial S}{\partial U}\right)_{V,N}. \tag{2-9}$$
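
To see how (2-9) works in practice, one may differentiate an entropy function numerically. The sketch below assumes, purely for illustration, the energy dependence of the entropy of a classical monatomic ideal gas, namely (3/2) N k_B ln U plus terms independent of U (the ideal gas itself is taken up in chapter 3); the function names and the numerical values of N and U are my own choices. The result can be compared with the familiar relation U = (3/2) N k_B T.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)

def entropy(U, N):
    """U-dependent part of the ideal gas entropy (assumed form), in J/K."""
    return 1.5 * N * K_B * math.log(U)

def temperature(U, N, rel_step=1e-6):
    """Temperature from eq. (2-9), using a central finite difference in U."""
    h = U * rel_step
    dS_dU = (entropy(U + h, N) - entropy(U - h, N)) / (2.0 * h)
    return 1.0 / dS_dU

N = 1.0e22           # number of particles (illustrative)
U = 60.0             # internal energy in joules (illustrative)

T_numeric = temperature(U, N)
T_expected = 2.0 * U / (3.0 * N * K_B)   # from U = (3/2) N k_B T
```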


2.1.1.1 Microcanonical ensemble: alternative definition 


There is an alternative definition of a microcanonical ensemble, one that
differs formally from the one outlined above, but is equivalent to the
latter in the thermodynamic limit. In this alternative definition, the
ensemble is assumed to be a mixture of energy eigenstates with energy values
lying between $E_0$ (the ground state energy; one can choose the energy
scale such that $E_0 = 0$) and some upper limit, say $U$, where the weights
associated with all these eigenstates are equal, being $\frac{1}{W}$ each,
where $W$ now stands for the number of independent stationary states with
energies lying in the above range.


That the two definitions can turn out to be equivalent may appear to be
paradoxical, since one of these involves stationary states with energy values
lying in the narrow range from $U$ to $U + \delta U$ (with $\delta U \ll U$),
while the other involves stationary states with energies lying in the range
from $E_0$ all the way up to $U$.


However, the equivalence emerges only in the thermodynamic limit when, 
because of the large values of N and V, most of the stationary states with 
energies in the range 0 to U turn out to be packed within a thin energy 
slice near the upper limit U. A simple instance of this is provided by the 
classical ideal gas, for which the thermodynamic limit will be considered 
in sec. 3.2.1 (though defined within the classical framework, the ideal gas 
can be described in the quantum context as well, in the so-called classical 


limit; see sec. 3.3.4). 
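The packing of states near the upper limit can be made quantitative in a minimal sketch. It assumes the standard classical ideal-gas result that the number of states with energy up to $E$ grows as $E^{3N/2}$ (the ideal gas itself is taken up in sec. 3.2.1); the function name and the chosen slice width are illustrative only.

```python
import math

def fraction_in_top_slice(n_particles: int, rel_width: float) -> float:
    """Fraction of the states in [0, U] that lie in the slice [U(1 - rel_width), U].

    Assumption: the number of states below energy E scales as E**(3N/2).
    """
    exponent = 1.5 * n_particles
    # (1 - rel_width)**exponent, evaluated through logarithms for accuracy
    below_slice = math.exp(exponent * math.log1p(-rel_width))
    return 1.0 - below_slice

# A slice of relative width 0.1% holds almost nothing for one particle,
# and essentially everything for a macroscopic number of particles.
for n in (1, 10, 1000, 10**6):
    print(n, fraction_in_top_slice(n, 1e-3))
```

For a fixed relative width of the slice, the fraction tends to unity as $N$ grows, which is the sense in which the two definitions of the microcanonical ensemble agree in the thermodynamic limit.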


2.1.2 The maximum entropy principle 


Notice that the only parameter (apart from the constant $k_B$) used in
defining the microcanonical ensemble for any given system (the system
features are all incorporated in the Hamiltonian $\hat H$) is the constant
value ($U$) of the energy eigenvalues involved in the mixture (the fact that
a small range of energy is referred to is a consequence of unavoidable
uncertainties in the specification of energy, and the range may be assumed to
be arbitrarily small). The system under consideration being an isolated one,
the specification of the range of energy eigenvalues is the only constraint
characterizing its state.


Recall also that, generally speaking, the density matrix corresponding to
a stationary ensemble has to be diagonal in the energy representation, i.e.,
$\langle E_i|\hat\rho|E_j\rangle$ has to be of the form $w_i\delta_{ij}$
($i, j = 1, 2, \ldots$), where $\delta_{ij}$ stands for the Kronecker delta
symbol. Here and in the following we will assume for the sake of simplicity
that the energy levels of the system are all non-degenerate though, for the
present, all our considerations apply to systems with degenerate energy
levels as well. For large systems with mutual interactions among their
constituents, the non-degeneracy assumption is presumably a valid one.


The diagonal elements $w_i$ ($i = 1, 2, \ldots$) of the density matrix,
representing the weights of the corresponding energy eigenstates in the
stationary ensemble, can in general be functions of the energies $E_i$. In
the special instance of the microcanonical ensemble, these are all equal for
$i = J+1, \ldots, J+W$ (in the notation of sec. 2.1.1), and zero for all other
values of the index $i$.


Consider, now, the following question. Given a set of constraints on the
system, can one formulate a principle that lets us determine the dependence
of $w_i$ on $E_i$?


In the present context, the question posed above boils down to the
following: given a set of time-independent constraints on the system, can one
formulate a general principle that determines the dependence of the weights
$w_i$ on the energies $E_i$, and from which the above result relating to the
microcanonical ensemble can be derived as a special case?


Such a principle can indeed be stated, and is known as the maximum entropy
principle: the weights $w_i$ of the mixed stationary state of the system are
to be such that the von Neumann entropy $S_q$ has a maximum value compatible
with the given constraints.


In applying this principle to the case of the microcanonical ensemble we
note that, in the energy representation, the diagonal elements $p_i$ in (2-8)
are the weights $w_i$ (for those values of the index $i$ that correspond to
the states involved in the mixture; for the other possible values of the
index $i$, the $w_i$ are all zero by the nature of the constraint, and need
not be considered). It is then not difficult to check that the unique
distribution for which $S_q = -\sum_i w_i\ln w_i$ attains a maximum value is
given by $w_i = \frac{1}{W}$ ($i = J+1, J+2, \ldots, J+W$) (check this out;
the constraint in this case is taken into account by taking $w_i = 0$ for $i$
having values outside the range $J+1, \ldots, J+W$, in the notation of
sec. 2.1.1). In other words, the microcanonical ensemble constitutes an
instance of the validity of the maximum entropy principle stated above.
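The claim that equal weights maximize the entropy can be checked numerically. What follows is only a sketch: the number of states `W`, the random trial distributions and the helper names are chosen here for illustration and are not part of the text.

```python
import math
import random

def entropy(weights):
    """Return -sum w ln w, with the convention 0 ln 0 = 0."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

def random_distribution(n_states, rng):
    """A random normalized set of non-negative weights (illustrative helper)."""
    raw = [rng.random() for _ in range(n_states)]
    total = sum(raw)
    return [r / total for r in raw]

W = 8                                   # number of states in the energy slice
uniform = [1.0 / W] * W                 # microcanonical weights
rng = random.Random(0)                  # fixed seed for reproducibility
trials = [random_distribution(W, rng) for _ in range(1000)]
max_trial_entropy = max(entropy(t) for t in trials)

# The uniform entropy equals ln W and is not exceeded by any trial distribution.
print(entropy(uniform), math.log(W), max_trial_entropy)
```

A random search of this kind proves nothing, of course; it only illustrates the statement, whose proof rests on the concavity of $-w\ln w$.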


Notice that the principle, as stated here, is not a manifestly dynamical
one (i.e., it does not tell us that an arbitrarily chosen initial mixed state
must evolve in time towards one of maximum entropy), nor does it tell us
why a given set of time-independent constraints must correspond to the
uniquely determined stationary ensemble implied by the requirement of
maximum entropy. For instance, one could have a stationary ensemble for
which the weights $w_i$ make up an arbitrarily chosen function of the energies
$E_i$ ($i = 1, 2, \ldots$), not necessarily one that makes $S_q$ a maximum. One
can meaningfully address these two issues for large systems, or for subsystems
of large systems, where these exhibit thermodynamic behavior. The issue of
thermalization will be taken up in chapters 9, 10.


2.1.3 Systems away from the thermodynamic limit: non-thermodynamic variables


Considering a system away from the thermodynamic limit, i.e., one of a 
finite size, and assuming it to be in a mixed stationary state described by 
the microcanonical ensemble, or by any of the other ensembles defined 
below in this chapter (either in the quantum or the classical context), 
one can ask the question as to whether a set of state variables (i.e., ones 
having well defined values in any specified mixed stationary state) can 
be defined for it with features analogous to the thermodynamic variables. 
We will consider the quantum mechanical microcanonical ensemble here 
for the sake of concreteness, similar statements being applicable for the 


other ensembles as well. 


It is to be noted at the outset that, for a system of arbitrary size, the
parameters $U, V, N$ do not uniquely specify a state of the system in
microscopic terms, and there are a host of other parameters on which the
state must depend. For instance, $\delta U$ is one such parameter, since the
number of states $W$ is likely to depend on $\delta U$ in a complex manner. As
we will see (refer to section 3.2.1 where the case of a classical ideal gas
is considered; see also section 5.3), $\delta U$ becomes irrelevant in
determining the state (a mixed stationary state in general) only in the
thermodynamic limit, and that too only if $\frac{\delta U}{U}$ goes to zero in
a particular manner. Consider next the region (call it $R$) of volume $V$ in
which the particles making up the system are confined. The energy eigenvalues
$E_i$ ($i = 0, 1, 2, \ldots$), which form a discrete set for a system confined
within a finite volume, are not determined solely by the volume $V$, but
depend, in general, on the shape of the region in a complex and delicate
manner. The shape, however, cannot be uniquely specified in terms of only one
or a few parameters (fig. 2-1 will give you an idea) and, indeed, requires an
infinite number of parameters for a complete specification. Another set of
factors of equal relevance relates to the boundary conditions at the surface
enclosing the region $R$, depending on the interaction of the particles of
the system with those making up the boundary wall, or with the rest of the
universe. Here again, a set of factors of great complexity is likely to be
involved.
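The dependence of the energy levels on shape at a fixed volume can be seen in the simplest possible setting. The sketch below assumes a single particle in a hard-walled rectangular box (a much simpler system than the interacting fluid of the text), for which the levels are $\frac{h^2}{8m}\left(\frac{n_x^2}{a^2} + \frac{n_y^2}{b^2} + \frac{n_z^2}{c^2}\right)$; the function name and the particular edge lengths are illustrative.

```python
import itertools

def box_levels(a, b, c, n_max=6, count=5):
    """Lowest `count` single-particle levels of a hard-walled box with edges a, b, c.

    Energies are in units of h^2/(8m); quantum numbers run from 1 to n_max.
    """
    levels = sorted(
        (nx / a) ** 2 + (ny / b) ** 2 + (nz / c) ** 2
        for nx, ny, nz in itertools.product(range(1, n_max + 1), repeat=3)
    )
    return levels[:count]

cube = box_levels(1.0, 1.0, 1.0)   # volume 1, cubical shape
slab = box_levels(2.0, 1.0, 0.5)   # volume 1, different aspect ratios

print("cube:", cube)
print("slab:", slab)
```

The two regions have the same volume, yet their spectra differ already at the ground level, so that $V$ alone does not fix the energy eigenvalues.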


In other words, the energy eigenvalues $E_i$ ($i = 0, 1, 2, \ldots$), as also
the number $W$, depend on a great many parameters other than the ones
($U, V, N$) considered above in the definition of the microcanonical
ensemble. Denoting all these parameters (possibly making up a
non-denumerably infinite set) by the symbol $\xi$, a complete specification
is obtained in terms of ($U, V, N, \xi$), among which little is usually known
about $\xi$, and the parameters making up $\xi$ cannot be specified with
precision.


In the thermodynamic limit, the parameters $\xi$ do not matter, i.e., as
$V, N$ are made to go to infinity (with $\frac{V}{N}$ going to a finite
limit), the state of the system is uniquely specified in terms of
$\frac{U}{N}, \frac{V}{N}$. In other words, the effect of these additional
parameters becomes arbitrarily small, provided these are held at fixed values
or, more generally, their variation is controlled appropriately. For
instance, assuming the region $R$ to be in the shape of a rectangular
parallelepiped with edge lengths $a, b, c$, one has to consider the aspect
ratios $\frac{b}{a}, \frac{c}{a}$ in addition to the volume $V$ (additionally,
there may be many other shapes, with corresponding parameters characterizing
these shapes, which we do not consider for the time being). If, now, we go
over to the limit $V \to \infty$ with the aspect ratios going to zero, i.e.,
consider an infinitely long needle shaped region $R$, then a complete
specification of a mixed stationary state in terms of $U, V, N$ will not be
possible since, in this case, surface energy terms are likely to have a great
effect on the behavior of the system. Moreover, the system behavior will
depend on the manner in which the aspect ratios go to zero.

Figure 2-1: Depicting a region of space of a given volume $V$, with a number
of different possible shapes of the boundary surface; (A) a section of a
cubical region of edge length $a$ ($a^3 = V$); we will mostly assume such a
cubical shape within which the system under consideration is confined; (B) a
section of a region in the shape of a rectangular parallelepiped with edge
lengths $a, b, c$ ($abc = V$); the energy levels of the system confined
within such a region will depend on the aspect ratios
$\frac{b}{a}, \frac{c}{a}$; (C) a boundary surface of more complex shape,
where a larger number of parameters are necessary to specify the shape; in
the limit $V \to \infty$, the thermodynamic behavior of the system under
consideration does not depend on these additional parameters, provided that
the volume is made to go to infinity in an appropriate manner, which is
conformed to by a large class of systems of practical interest.


Assuming that the parameters specifying the shape of the region $R$ are
either held constant (which means, in the case of a rectangular region, that
the aspect ratios have fixed finite values) or else are allowed to vary only
in some appropriately controlled manner, the behavior of a macroscopic
system (a one-component fluid in the present context) will be determined
solely by $U, V, N$ in the thermodynamic limit. More specifically, this is
the sense in which the thermodynamic limit can be said to exist. In
this sense, all the additional variables $\xi$ (made up of the shape
parameters and the energy width $\delta U$) become irrelevant in the
description of the system behavior, and their effect can be said to have been
quashed by virtue of the large size of the system. The variables $\xi$ can
then be termed ‘non-thermodynamic’ ones in contrast to the basic
thermodynamic state variables $U, V, N$. Alternative sets of the basic
variables can be introduced by means of the canonical and the grand canonical
ensembles considered below in the present chapter, in keeping with
alternative thermodynamic descriptions related by means of Legendre
transformations.


2.1.4 Families of ensembles: enlarging the set of state functions by means of derivatives


Starting from the basic variables defining any given ensemble, one can
consider families of ensembles characterized by continuously varying
values of the basic variables, and introduce a larger set of state functions
by means of derivatives. For instance, considering a family of
microcanonical ensembles with continuously varying values of $U$, one can
define the temperature of a state by the formula (2-9). However, strictly
speaking, this formula does not make sense for a system away from the
thermodynamic limit since, for a system of finite size, the spectrum of
energy values $E_i$ ($i = 0, 1, 2, \ldots$) is discrete, and the entropy (as
also the number of relevant states $W$) is a discontinuous function of $U$
(reason this out). In the thermodynamic limit, on the other hand, the energy
levels are densely packed and one can indeed define the temperature by means
of an appropriate limiting procedure. In a similar manner, one can define
the pressure of the system (recall that we are referring to a simple fluid
as our system of interest for the sake of concreteness) by considering a
family of microcanonical distributions with continuously varying values
of $V$ and setting

$$p = T\left(\frac{\partial S}{\partial V}\right)_{U,N}. \tag{2-10}$$


The symbol p used to denote pressure is not to be confused with the same 
symbol used at times to denote probabilities, as in formula (3-45) and other 
similar expressions; the intended meaning for the symbol will always be 


clear from the context. 


Once again, the derivative makes physical sense in the thermodynamic
limit, when the additional variables $\xi$ get quashed. An analogous
definition of the chemical potential of the system would require
differentiation with respect to $N$ which, however, is a discrete variable,
taking up only integer values. In the thermodynamic description, on the other
hand, $N$ is replaced with the continuous variable $\nu$ (the mole number;
$N = \nu A$, where $A$ is the Avogadro number), and one can define the
chemical potential per mole ($\bar\mu$) by means of differentiation with
respect to $\nu$,

$$\bar\mu = -T\left(\frac{\partial S}{\partial\nu}\right)_{U,V}. \tag{2-11}$$

The basic thermodynamic formula (refer to (1-2))

$$T\,dS = dU + p\,dV - \bar\mu\,d\nu, \tag{2-12}$$


can then be seen to follow, as do all the standard thermodynamic
relations including, in particular, the Euler formula

$$U = TS - pV + \bar\mu\nu, \tag{2-13}$$

which corresponds to the fact that the variables $U, S, \nu$ are extensive ones.
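As a hedged illustration of how derivatives of the entropy generate the remaining state functions, consider an entropy function of the ideal-gas form $S = N\left[\ln\frac{V}{N} + \frac{3}{2}\ln\frac{U}{N} + C\right]$ in units with $k_B = 1$. This form, the constant `C`, the use of the particle number $N$ as a continuous variable in place of the mole number, and the finite-difference helper are all assumptions made for this sketch only.

```python
import math

C = 2.5  # additive constant per particle; its value does not affect the checks

def entropy(U, V, N):
    """Ideal-gas-like entropy in units of k_B; extensive in (U, V, N)."""
    return N * (math.log(V / N) + 1.5 * math.log(U / N) + C)

def partial(f, args, index, h=1e-6):
    """Central finite-difference derivative of f with respect to args[index]."""
    up = list(args)
    up[index] += h
    down = list(args)
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

U, V, N = 150.0, 40.0, 100.0
state = (U, V, N)

T = 1.0 / partial(entropy, state, 0)    # temperature, as in (2-9)
p = T * partial(entropy, state, 1)      # pressure from the volume derivative
mu = -T * partial(entropy, state, 2)    # chemical potential per particle

# Right hand side of the Euler formula, with mu*N in place of the molar form
euler_rhs = T * entropy(*state) - p * V + mu * N

print(T, p * V / (N * T), euler_rhs, U)
```

With these numbers one finds $U = \frac{3}{2}NT$ and $pV = NT$, and the Euler formula is reproduced because the assumed entropy is a homogeneous function of degree one in its arguments.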


A similar procedure can be followed by starting with the other ensem- 
bles, as will be indicated below in sections dealing with the canonical and 


grand canonical ensembles. 


The variables $T, p, \bar\mu$ introduced in the above manner can be shown to
satisfy the physical requirements expected of these. For instance,
considering a given quantity of gas in a chamber of volume $V$, imagine a
constrained equilibrium in the presence of an adiabatic partition, where the
left and the right halves of the chamber separated by the partition contain
gas with parameter values $U_1, S_1, T_1$ and $U_2, S_2, T_2$ respectively,
with, say, $T_1 > T_2$. If now the constraint is removed and the two halves
are allowed to attain equilibrium, the changes in the parameter values in the
two halves will be related by

$$\begin{aligned}
\Delta U_1 + \Delta U_2 &= 0,\\
\Delta S_1 + \Delta S_2 &> 0,\\
T_1\Delta S_1 = \Delta U_1,\quad T_2\Delta S_2 &= \Delta U_2,
\end{aligned} \tag{2-14a}$$

where the first line results from conservation of energy, the second line from
increase of entropy, and the third line from (2-9) (we assume the changes to
be infinitesimal). Together, these formulae give

$$-\Delta U_2 = \Delta U_1 < 0, \tag{2-14b}$$

telling us that energy (in the form of heat) flows from the region with a higher
value of the parameter $T$ to the region with a lower value. This is precisely
what is expected of temperature as a state variable. Analogous considerations
apply to the case of $p$ representing pressure and of $\bar\mu$ representing
the molar chemical potential.
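The direction of the energy flow can be illustrated with a toy model. The sketch assumes that each half has an entropy of the form $S_i = C_i\ln U_i$ (so that $T_i = U_i/C_i$, in units with $k_B = 1$); this form and the numerical values are hypothetical and serve only to make the sign argument concrete.

```python
import math

def total_entropy(U1, U2, C1=1.5, C2=1.5):
    """Entropy of the two halves, each with S_i = C_i ln U_i (assumed form)."""
    return C1 * math.log(U1) + C2 * math.log(U2)

def temperature(U, C=1.5):
    """Temperature from 1/T = dS/dU for S = C ln U."""
    return U / C

U1, U2 = 6.0, 3.0          # half 1 is hotter: T1 = 4, T2 = 2
dU = 1e-3                  # a small amount of energy exchanged

gain_hot_to_cold = total_entropy(U1 - dU, U2 + dU) - total_entropy(U1, U2)
gain_cold_to_hot = total_entropy(U1 + dU, U2 - dU) - total_entropy(U1, U2)

# Only the transfer from the hotter half to the colder one raises the entropy.
print(temperature(U1), temperature(U2), gain_hot_to_cold, gain_cold_to_hot)
```

The total energy is conserved in both trial transfers, but only the one in which the hotter half loses energy is compatible with the increase of entropy.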


The question can now be asked as to how far a corresponding approach 


works for systems away from the thermodynamic limit. 


Here a fundamental problem comes up, one relating to the non-thermodynamic
variables denoted collectively by $\xi$. It is, however, interesting to see
whether a set of variables analogous to the thermodynamic ones can be
defined by assuming these non-thermodynamic variables to be held fixed.
Such definitions can be no more than formal ones since, from the
operational point of view, the infinite number of variables $\xi$ can be
specified precisely only ideally.

Even making the idealized assumption of a constant value of the variables
$\xi$, one cannot make much headway with the microcanonical ensemble since,
for a system of finite size, the discreteness of the energy levels stands in
the way of defining a variable analogous to the temperature, and the
discreteness of the ‘entropy’ $S$ (formula (2-2); the number of states $W$ is
a finite integer for a finite system) stands in the way of defining a
variable analogous to the pressure. In the case of the canonical and grand
canonical ensembles, on the other hand, the situation is different, and
interesting too, as we will see below.


2.1.5 The canonical ensemble 


Continuing to consider a system of $N$ particles in a volume $V$, described
by the Hamiltonian (1-11), let us once again denote the stationary states
as $|E_0\rangle, |E_1\rangle, \ldots, |E_i\rangle, \ldots$, and the
corresponding energy eigenvalues as $E_0, E_1, \ldots, E_i, \ldots$ where,
commonly, the energies are arranged in a non-decreasing order, with $E_0$ as
the ground state energy. There are grounds to suppose that, in the
thermodynamic limit, the energies are non-degenerate, i.e., each energy has
only a single eigenstate associated with it (up to an inessential phase
factor). However, this supposition is not an essential one in the formal
definition of the ensembles we consider.


The canonical ensemble is then defined to be a mixture of all the energy
eigenstates of the system, where a state of energy $E_i$
($i = 0, 1, 2, \ldots$) carries a weight $w_i \propto \exp(-\beta E_i)$, and
where $\beta$ is a parameter that is, for the time being, left unspecified.
Since all the weights taken together have to add up to unity (refer to
(1-15), which is written for a finite set of states making up a mixture), one
obtains

$$w_i = \frac{1}{Z_c}\exp(-\beta E_i), \tag{2-15a}$$

where the denominator appearing in the normalization constant
($\frac{1}{Z_c}$) is given by

$$Z_c = \sum_i \exp(-\beta E_i), \tag{2-15b}$$


this being termed the partition function pertaining to the canonical en- 
semble. In formula (2-15b) for the partition function, the summation 
extends over all the energy eigenvalues of the system (recall that the mi- 
crocanonical ensemble was defined as a mixture of only those eigenstates 
for which the energy lies in a small interval and that one can recast the 
definition such that all the stationary states are involved with the added 
clause that the weights for all those with energies outside that interval 


are zero). 
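For a finite list of energy levels the canonical weights and the partition function can be computed directly. The following is a minimal sketch; the spectrum, the value of the parameter and the function name are hypothetical, and the shift by the lowest energy is a numerical convenience that cancels in the weights.

```python
import math

def canonical_weights(energies, beta):
    """Return the weights exp(-beta*E_i)/Z_c and the partition function Z_c."""
    e_min = min(energies)
    # Factor out exp(-beta*e_min) for numerical stability.
    shifted = [math.exp(-beta * (e - e_min)) for e in energies]
    z_shifted = sum(shifted)
    weights = [s / z_shifted for s in shifted]
    partition_function = z_shifted * math.exp(-beta * e_min)
    return weights, partition_function

energies = [0.0, 1.0, 1.5, 3.0]          # hypothetical spectrum, arbitrary units
weights, Z = canonical_weights(energies, beta=2.0)

print(weights, sum(weights), Z)
```

The weights add up to unity and, for a positive value of the parameter, decrease with increasing energy, while all the levels contribute to the sum defining the partition function.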


Like the microcanonical ensemble, the canonical ensemble is a stationary
one, the density matrix being diagonal in the energy representation:

$$\hat\rho_c = \frac{1}{Z_c}\sum_i e^{-\beta E_i}|E_i\rangle\langle E_i|, \tag{2-16a}$$

with diagonal elements

$$\rho_{c,ii} = \frac{1}{Z_c}e^{-\beta E_i} \quad (i = 0, 1, 2, \ldots), \tag{2-16b}$$

these being, precisely, the weights $w_i$ of (2-15a).

Given an observable $\hat A$, its mean value in the state represented by the
canonical ensemble is given by

$$\langle\hat A\rangle = \mathrm{Tr}(\hat\rho_c\hat A) = \frac{1}{Z_c}\sum_i e^{-\beta E_i}A_{ii}, \tag{2-17}$$

where $A_{ii}$ ($i = 0, 1, 2, \ldots$) stands for the diagonal element
$\langle E_i|\hat A|E_i\rangle$ of the matrix representing $\hat A$ in the
energy representation.


Analogous to the definition of the entropy in terms of the number of
microstates $W$ within a specified narrow energy interval, i.e., the
microcanonical partition function, one defines the ‘free energy’ of a state
defined by the canonical ensemble as

$$F = -\beta^{-1}\ln Z_c, \tag{2-18}$$

while the ‘temperature’ is defined by the formula

$$T = \frac{1}{k_B\beta}. \tag{2-19}$$

Though these two definitions remain valid even for a finite system, they
assume relevance in the thermodynamic limit.


Since the ‘temperature’, as defined above, can be made to vary
continuously, one can define the ‘entropy’ by means of differentiation with
respect to $T$ as

$$S = -\left(\frac{\partial F}{\partial T}\right)_V \tag{2-20}$$




and also the ‘pressure’ as in (2-10)

$$p = -\left(\frac{\partial F}{\partial V}\right)_T \tag{2-21}$$

(recall that the canonical distribution is defined for a fixed integer value
of $N$ as in the case of the microcanonical ensemble).


In contrast to the case of the microcanonical ensemble, $T$ and $S$ are
defined even for a finite system, though such definitions are formal, since
these rest on the counterfactual assumption that the non-thermodynamic
variables $\xi$ are held at constant values, which is why quotes are used for
the designations ‘free energy’, ‘temperature’, ‘pressure’, and ‘entropy’. In
accordance with these definitions, the basic thermodynamic formula

$$dF = -S\,dT - p\,dV, \tag{2-22}$$

is satisfied, analogous to (1-1b). Further, the ‘internal energy’ $U$ is
defined (once again, formally, assuming that the set of non-thermodynamic
variables $\xi$ is held fixed) as

$$U = \langle E\rangle = \frac{1}{Z_c}\sum_i E_i\,e^{-\beta E_i}. \tag{2-23}$$

What is interesting to note in this context is that the variables defined
formally as above satisfy the thermodynamic relation

$$F = U - TS, \tag{2-24}$$

among themselves (check this out), even though the functions $F, U, S$ do
not turn out to represent extensive variables for a finite system. As in
the case of the microcanonical ensemble, the chemical potential cannot
be given a formal definition in a consistent manner for a finite system in
the canonical ensemble either.


All these variables defined formally for a finite system acquire physical
relevance (going over to the corresponding thermodynamic variables) in
the thermodynamic limit, when the effect of the non-thermodynamic
variables $\xi$ is quashed by the large size of the system. In other words,
the formal relations among the variables analogous to the thermodynamic
ones continue to be preserved for each given $V$ and $N$ and for any
specified set of values of the variables $\xi$, right up to the
thermodynamic limit, when the latter become irrelevant.


In the thermodynamic limit, one can also define the chemical potential
$\bar\mu$ as

$$\bar\mu = \left(\frac{\partial F}{\partial\nu}\right)_{T,V}, \tag{2-25}$$

whereby the formula (1-2) is seen to be satisfied, and the thermodynamic
variables $U, S, F$ acquire the property of extensivity.


Considerations of this section (as also of section 2.1.5.2 below) can
be interpreted by saying that the canonical ensemble provides us with a
formal thermodynamic model for a system of arbitrary size. The model is
only a formal one since, away from the thermodynamic limit, it rests on
the counterfactual assumption that the non-thermodynamic variables $\xi$ are
to be held constant at a specified set of values.


I close this section by pointing out that the entropy defined in (2-20)
agrees with the formula

$$S\,(= k_BS_q) = -k_B\sum_i w_i\ln w_i, \tag{2-26}$$

where $w_i$ ($i = 0, 1, 2, \ldots$) are the weights defined in (2-15a),
(2-15b) (check this out; the statement holds formally even for a finite
system).
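The two checks suggested in this section can be carried out numerically for a small system. The sketch below assumes a hypothetical five-level spectrum and units with $k_B = 1$; the derivative with respect to the ‘temperature’ is approximated by a central difference, and all function names are illustrative.

```python
import math

K_B = 1.0                                  # units with k_B = 1 (assumption)
ENERGIES = [0.0, 0.7, 1.1, 2.4, 3.0]       # hypothetical finite spectrum

def partition_function(T):
    beta = 1.0 / (K_B * T)
    return sum(math.exp(-beta * e) for e in ENERGIES)

def free_energy(T):
    """'Free energy' -(1/beta) ln Z_c, with beta = 1/(k_B T)."""
    return -K_B * T * math.log(partition_function(T))

def weights(T):
    beta = 1.0 / (K_B * T)
    z = partition_function(T)
    return [math.exp(-beta * e) / z for e in ENERGIES]

def mean_energy(T):
    """'Internal energy' as the weighted mean of the energy levels."""
    return sum(w * e for w, e in zip(weights(T), ENERGIES))

def gibbs_entropy(T):
    """'Entropy' from the weights, -k_B sum w ln w."""
    return -K_B * sum(w * math.log(w) for w in weights(T))

def entropy_from_derivative(T, h=1e-5):
    """'Entropy' as -dF/dT, by a central finite difference."""
    return -(free_energy(T + h) - free_energy(T - h)) / (2.0 * h)

T = 0.8
print(gibbs_entropy(T), entropy_from_derivative(T))
print(free_energy(T), mean_energy(T) - T * gibbs_entropy(T))
```

Both pairs of numbers agree to within the accuracy of the finite difference, even though the system has only five levels and is nowhere near the thermodynamic limit.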


The microcanonical and the canonical ensembles, both of which repre- 
sent mixed stationary states, differ in the constraints they conform to — 
the former is a mixture of pure stationary states all of which belong to 
energy eigenvalues lying within a small range, while the latter involves 
stationary states with all possible energies, with the constraint that the 
mean energy of the system (refer to eq. (2-23)) is to have some specified 


value, say, U (the internal energy in the thermodynamic description). 


From the operational point of view, the microcanonical ensemble
corresponds to an isolated system though, strictly speaking, a complete
isolation is never possible (we will have more to say on this later), while
the canonical ensemble corresponds to a system interacting with other
systems in such a way that its mean energy has some specified value. Once
again, this can be interpreted as its internal energy in the thermodynamic
description; we will see later in section 5.4 how the two interpretations of
the internal energy, one in terms of the microcanonical ensemble and the
other in terms of the canonical ensemble, coincide in the thermodynamic
limit. Though the canonical ensemble is defined in terms of the parameter
$\beta$, and the mean energy ($\langle E\rangle$, also denoted by the symbol
$U$) does not appear directly as a parameter in the defining formula (2-15a),
the two quantities $\langle E\rangle$ and $\beta$ are nevertheless related by
formula (2-23).


As a simple example of the quantum mechanical canonical distribution, 


we consider a single simple harmonic oscillator at temperature T. 


The 1D quantum mechanical simple harmonic oscillator is a system
described by the Hamiltonian

$$\hat H = \frac{1}{2m}\hat p^2 + \frac{1}{2}m\omega^2\hat q^2, \tag{2-27}$$

where $\hat q, \hat p$ denote the operators for the position and momentum
variables of the oscillator, $m$ stands for the mass, and $\omega$ for the
angular frequency. Its energy eigenvalues are of the form
$(\frac{1}{2} + n)\hbar\omega$ ($n = 0, 1, 2, \ldots$), with $n$ the quantum
number characterizing the corresponding eigenstate $|n\rangle$. Assuming that
the oscillator is in equilibrium at temperature $T$ (say, with a thermal
bath; the canonical distribution remains meaningful for ‘small’ systems, see
sec. 2.1.5.2), the partition function is obtained as

$$\text{[1D harmonic oscillator:]}\qquad Z_c = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega\left(n+\frac{1}{2}\right)} = \frac{e^{-\frac{1}{2}\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}}. \tag{2-28}$$

This result on the single simple harmonic oscillator comes in handy in a
considerably wide range of problems in statistical mechanics.

2.1.5.1 The canonical ensemble in the light of the maximum entropy principle


One can now address the following variational problem: consider a mixed
state made up of stationary states $|E_i\rangle$ ($i = 0, 1, 2, \ldots$) with
arbitrarily specified weights $w_i$ and with some specified value (say, $U$)
of the mean energy $\langle\hat H\rangle$. What distribution of the weights
$w_i$ ($i = 0, 1, 2, \ldots$) would then correspond to the maximum value of
the entropy $S_q$ consistent with this constraint? This problem can be solved
by the method of undetermined multipliers, which leads to the result that
there exists a certain parameter $\beta$ in terms of which the required
distribution of weights is given by the formulas (2-15a), (2-15b), the
relation between $\beta$ and $U$ being precisely the one given by (2-23)
(check this out).
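The method of undetermined multipliers can be sketched as follows. The multiplier $\alpha$ for the normalization constraint and the auxiliary function $\Phi$ are notations introduced here for illustration only; the last line records the second order term in the variation of the entropy about the stationary point.

```latex
\begin{aligned}
\Phi &= -\sum_i w_i \ln w_i
        - \alpha\Big(\sum_i w_i - 1\Big)
        - \beta\Big(\sum_i w_i E_i - U\Big),\\
\frac{\partial\Phi}{\partial w_i} &= -\ln w_i - 1 - \alpha - \beta E_i = 0
  \;\Longrightarrow\; w_i = e^{-(1+\alpha)}\,e^{-\beta E_i},\\
\sum_i w_i = 1 &\;\Longrightarrow\; e^{1+\alpha} = \sum_i e^{-\beta E_i} = Z_c,
  \qquad w_i = \frac{1}{Z_c}\,e^{-\beta E_i},\\
\sum_i w_i E_i = U &\;\Longrightarrow\; U = \frac{1}{Z_c}\sum_i E_i\,e^{-\beta E_i},\\
\text{second order term:}\quad & -\frac{1}{2}\sum_i \frac{(\delta w_i)^2}{w_i} < 0 .
\end{aligned}
```

The third line reproduces the canonical weights, the fourth relates the multiplier $\beta$ to the prescribed mean energy, and the negative sign in the last line shows that the stationary point is a maximum.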


Speaking in more precise terms, the above statement is made up of two
parts. The first part is the statement that the first order variation of the
entropy vanishes when the parameters $w_i$ make up a canonical ensemble,
while the second part is the statement that the second order variation of
the entropy is negative when the canonical distribution of the $w_i$’s is
compared with any neighbouring distribution. You will have a more complete
understanding of these two parts of the maximum entropy principle from
section 2.1.5.3 where the analogous principle of minimum free energy is
outlined.


The maximum entropy principle actually tells us more, to the effect that the
canonical distribution corresponds to the global maximum of $S_q$ for the given
value of the mean energy. However, this is not of relevance in the present
context.




On defining the ‘temperature’ corresponding to the canonical distribution 
by (2-19), one can check that the temperature, so defined, is indeed com- 
patible with the physical attributes associated with the thermodynamic 
concept of temperature (refer to sec. 2.1.5.2 below). It may be mentioned 


that this result makes no reference to the system size. 


In the thermodynamic limit, the maximum entropy principle corresponds 
to the entropy principle of thermodynamics: consider two equilibrium 
states of a given system with specified values of V, N and also of U (the 
requirement of a specified value of U can be replaced with any other con- 
straint compatible with the given values of V, N, refer to sec. 2.1.2), of 
which one does not involve any additional constraint other than the ones 
specified, and the other does involve additional internal constraints; if, 
now, these additional constraints are removed from the second state, the 
system will make a transition to the first, with an increase of entropy 


(refer to [15], chapters 1, 2). 


2.1.5.2 Thermodynamics of ‘small’ systems 


In sec. 2.1.5 we saw that, in terms of the canonical ensemble, state
functions analogous to the thermodynamic ones can be defined for systems
away from the thermodynamic limit (these will be referred to as ‘small’
systems), and that these satisfy relations analogous to those between the
thermodynamic state functions. We will now see that, in addition, these
functions can be given an operational interpretation as well, which indicates
that one can speak of a ‘thermodynamics of small systems’.


Such a small system is defined in terms of its Hamiltonian or,
equivalently, its set of energy levels $\{E_i\}$, and a mixed stationary
state of the system is described by the canonical ensemble, with $\beta$ as a
parameter that is, up to now, an arbitrary one.


A mixed state described by a canonical ensemble is often referred to as a 
Gibbs state. Commonly, however, this term is reserved for infinite systems, 
for which Gibbs states are defined in terms of a limiting procedure (see 


section 5.6). 


For such a system in a Gibbs state with parameter $\beta$, let us consider a
process sufficiently slow so that the system continues to be in a Gibbs
state, though the energies $E_i$ ($i = 0, 1, 2, \ldots$) and the
corresponding weights $w_i$ may get changed due to energy exchange with
external systems. Considering an infinitesimally small change in the
parameters in such a process, the change in the mean energy of the system can
be expressed as a sum of two parts:

$$d\langle E\rangle = d\Big(\sum_i w_iE_i\Big) = \sum_i\,(w_i\,dE_i + E_i\,dw_i), \tag{2-29}$$

of which the first part is due to the shift of the energy levels alone without
any change in the weights, and will be interpreted as the work performed
on the system ($\delta W$, where $\delta$ denotes an imperfect differential).
The remaining part, resulting from a change in the weights associated with
the energy levels, can then be interpreted as the heat supplied to the system
(it reduces to the actual heat transferred, in the thermodynamic limit):

$$\delta Q = d\langle E\rangle - \delta W = d\langle E\rangle - \sum_i w_i\,dE_i = \sum_i E_i\,dw_i. \tag{2-30}$$


On the other hand, the entropy change in the process is (with the entropy as
defined by (2-26))

$$dS = -k_B\,d\sum_i\,(w_i\ln w_i). \tag{2-31}$$

On noting that $\sum_i w_i = 1$, $\sum_i dw_i = 0$, and that
$\ln w_i = -\beta E_i - \ln Z_c$ for a Gibbs state, this gives

$$\delta Q = \frac{1}{k_B\beta}\,dS, \tag{2-32}$$

which, once again, is analogous to the corresponding thermodynamic
formula, where $\beta$ is interpreted as $(k_BT)^{-1}$ (see below). In order,
now, to give an operational significance to the parameter $\beta$ (relating
it to the temperature $T$ as mentioned), it is necessary to consider
interactions of such a system with other systems, where it is useful to
assume the interactions to be of short duration, in which case these are
referred to as ‘collisions’.
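The splitting of the change in mean energy into the two parts $\sum_i w_i\,dE_i$ and $\sum_i E_i\,dw_i$, and the proportionality between the second part and the entropy change, can be checked to first order in a small change of a Gibbs state. The three-level spectrum, the size of the change and all names below are hypothetical; units with $k_B = 1$ are assumed.

```python
import math

def gibbs_weights(energies, beta):
    boltzmann = [math.exp(-beta * e) for e in energies]
    z = sum(boltzmann)
    return [b / z for b in boltzmann]

def entropy(weights):
    """-sum w ln w, i.e. the entropy in units of k_B."""
    return -sum(w * math.log(w) for w in weights)

energies = [0.0, 1.0, 2.5]
beta = 1.3
eps = 1e-6
# A small change of both the levels and the parameter beta (hypothetical values)
new_energies = [e + eps * s for e, s in zip(energies, (0.0, 0.4, -0.2))]
new_beta = beta + 0.5 * eps

w_old = gibbs_weights(energies, beta)
w_new = gibbs_weights(new_energies, new_beta)

# Part of the energy change due to the change in weights: sum E_i dw_i
heat = sum(e * (wn - wo) for e, wn, wo in zip(energies, w_new, w_old))
# Part due to the shift of the levels at fixed weights: sum w_i dE_i
work = sum(wo * (en - e) for wo, en, e in zip(w_old, new_energies, energies))

d_mean_energy = (sum(w * e for w, e in zip(w_new, new_energies))
                 - sum(w * e for w, e in zip(w_old, energies)))
d_entropy = entropy(w_new) - entropy(w_old)

print(d_mean_energy, work + heat)
print(d_entropy, beta * heat)
```

The two numbers on each printed line differ only by terms of second order in the small change, in keeping with the interpretation of $\beta$ as the inverse temperature in units with $k_B = 1$.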


We consider collision between two such systems ‘A’ and ‘B’ in states p., 
and /p.p (not necessarily Gibbs states; we omit the ‘hat’ symbols over op- 
erators for the sake of brevity) before the collision. We assume that there 


is no interaction between the systems before the collision, when the com- 
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posite system made up of ‘A’ and ‘B’ is in the product state 


PoaB = Pca ® Pcp: (2-33) 


This signifies that there is no initial correlation between ‘A’ and ‘B’. The
entropy of the composite system is then $S(\rho_{cA}) + S(\rho_{cB})$, which
follows from the definition of the quantum mechanical entropy and that of a
product state. If, now, a collision takes place, the two systems become
correlated so as to produce the composite state, say, $\rho'_{cAB}$. After the
collision, the interaction ceases (but the correlation may remain), and we can
consider the description of either of the two systems without regard to the
other. Such a description is possible by means of the reduced state of either
system, where the two reduced states are defined in terms of partial traces:

$$\rho'_{cA} = \mathrm{Tr}_B(\rho'_{cAB}), \qquad \rho'_{cB} = \mathrm{Tr}_A(\rho'_{cAB}). \tag{2-34}$$


The entropies of the reduced states (referred to as the reduced entropies;
these may be interpreted as the entropies of the two systems after the
interaction) can then be shown to satisfy the inequality

$$S(\rho'_{cA}) + S(\rho'_{cB}) \geq S(\rho_{cA}) + S(\rho_{cB}), \tag{2-35a}$$

i.e., in an obvious notation,

$$\Delta S_A + \Delta S_B \geq 0. \tag{2-35b}$$

In other words, the total entropy increases (or, at the least, does not
decrease) due to the interaction. This is entirely due to the fact that the
left hand side of the above inequality has been obtained by ignoring the
correlation between ‘A’ and ‘B’ that may remain even after the interaction
ceases. The mean energy of the two interacting systems taken together,
however, remains conserved:

$$\Delta\langle E\rangle_A + \Delta\langle E\rangle_B = 0. \tag{2-36}$$
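The content of (2-35a) and (2-36) can be illustrated with the smallest possible example. The sketch below is only a toy model under stated assumptions: ‘A’ and ‘B’ are taken to be two-level systems with the same level spacing, the initial states are diagonal, and the ‘collision’ is represented by a hypothetical partial-swap unitary $\cos\theta\, I + i\sin\theta\,\mathrm{SWAP}$ (which commutes with the total energy). None of these choices comes from the text; they merely provide a concrete instance in which the reduced entropies and the mean energies can be computed.

```python
import math

SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def product_state(pa, pb):
    """rho_A (x) rho_B for diagonal qubit states; basis index is 2*a + b."""
    diag = [pa[a] * pb[b] for a in range(2) for b in range(2)]
    return [[complex(diag[i]) if i == j else 0j for j in range(4)]
            for i in range(4)]

def reduced_state(rho, keep):
    """Partial trace: keep=0 returns the state of 'A', keep=1 that of 'B'."""
    red = [[0j, 0j], [0j, 0j]]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                if keep == 0:
                    red[i][j] += rho[2 * i + k][2 * j + k]
                else:
                    red[i][j] += rho[2 * k + i][2 * k + j]
    return red

def entropy(rho):
    """-Tr(rho ln rho) for a 2x2 density matrix of unit trace."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = ((1.0 + root) / 2.0, (1.0 - root) / 2.0)
    return -sum(x * math.log(x) for x in eigenvalues if x > 1e-15)

def mean_energy(rho):
    """Mean energy of a qubit with levels 0 and 1 (units of the level gap)."""
    return rho[1][1].real

theta = 0.6  # hypothetical collision strength
u = [[math.cos(theta) * (i == j) + 1j * math.sin(theta) * SWAP[i][j]
      for j in range(4)] for i in range(4)]
u_dag = [[u[j][i].conjugate() for j in range(4)] for i in range(4)]

before = product_state([0.9, 0.1], [0.4, 0.6])   # uncorrelated initial state
after = matmul(matmul(u, before), u_dag)         # state after the collision

s_before = sum(entropy(reduced_state(before, k)) for k in (0, 1))
s_after = sum(entropy(reduced_state(after, k)) for k in (0, 1))
e_before = sum(mean_energy(reduced_state(before, k)) for k in (0, 1))
e_after = sum(mean_energy(reduced_state(after, k)) for k in (0, 1))
print(s_after - s_before, e_after - e_before)
```

In this toy model the sum of the reduced entropies grows while the total mean energy is unchanged, in line with (2-35b) and (2-36); the entropy of the full composite state is, of course, unchanged by the unitary evolution.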


In this section I follow the presentation in chapter 9 of the book ([110]) by 


Asher Peres, which constitutes a landmark in modern quantum theory. 


Based on these results and a few mathematical properties of the quantum
mechanical entropy of a state, one can arrive at the following important
consequences.


1. Considering the collision between a system (the one we focus on) in any
chosen initial state and a second system (which we term a reference system)
in a Gibbs state with parameter $\beta$, the inequality

$$\Delta(S - \beta\langle E\rangle) \geq 0, \tag{2-37}$$

holds, with the entropy $S$ expressed in units of $k_B$ (note that the
parameter $\beta$ does not pertain to the system under consideration but to
the one with which it is assumed to collide). Hence if the system undergoes
repeated collisions with reference systems, all in Gibbs states with parameter
$\beta$, then the quantity $S - \beta\langle E\rangle$ never decreases, and
eventually approaches a limiting value that maximizes the expression
$S - \beta\langle E\rangle$, subject to the constraint $\sum_i w_i = 1$ (we
assume that there are no other constraints operating). On making use of the
method of Lagrange multipliers one then concludes that, in the limit, the
system itself approaches a Gibbs state with parameter $\beta$, when it can be
said to be in thermal equilibrium with any reference system in a Gibbs state
with the same value of the parameter $\beta$.


2. Considering an interaction between two systems in Gibbs states with
different values of the parameter $\beta$ and making use of the concept of
heat transfer introduced by means of (2-32), one finds that the flow of heat
always takes place from the system with a higher value of $\beta^{-1}$ to the
one with a lower value. We can then interpret, in analogy with thermodynamic
systems, the parameter $\beta^{-1}$ as being proportional to the temperature.
In order to maintain analogy with the thermodynamic entropy $S$, we finally
arrive at the interpretation

$$\beta^{-1} = k_B T, \tag{2-38}$$

which establishes correspondence between (2-32) and the thermodynamic
definition of entropy.


3. Finally, a correspondence with the second law of thermodynamics is
established as follows. We consider a system that undergoes a cyclic process
through a series of steps where, at each step, it interacts with a reference
system (a ‘reservoir’) in a Gibbs state, where the parameter $\beta$ may have
different values for the different reference systems. At the end of the
cyclic process the system returns to its initial state, with the same value of
the entropy, while the sum of the changes of entropy for all the reservoirs,
calculated in accordance with formula (2-32), turns out to be non-negative:
the total entropy of the ‘universe’ either remains the same or increases.
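The first of the above consequences can be sketched in a toy model. The assumptions here are mine and not those of the text: the system and every reference system are two-level systems with the same level spacing, all states are diagonal, and each collision is a partial swap, for which the reduced state of the system after a collision is a convex mixture of its earlier state and the Gibbs state of the reference system. The sketch tracks $S - \beta\langle E\rangle$ (with $S$ in units of $k_B$) through repeated collisions.

```python
import math

def gibbs_excited_population(gap, beta):
    """Excited-state weight of a two-level Gibbs state with level spacing gap."""
    return 1.0 / (1.0 + math.exp(beta * gap))

def s_minus_beta_e(p_excited, gap, beta):
    """S - beta*<E> for a diagonal two-level state (S in units of k_B)."""
    weights = [1.0 - p_excited, p_excited]
    entropy = -sum(w * math.log(w) for w in weights if w > 0.0)
    return entropy - beta * gap * p_excited

gap, beta, theta = 1.0, 2.0, 0.4     # hypothetical parameters
mix = math.sin(theta) ** 2           # fraction exchanged in one collision
p_ref = gibbs_excited_population(gap, beta)

p = 0.95                             # arbitrary non-Gibbs initial population
history = [s_minus_beta_e(p, gap, beta)]
for _ in range(60):
    # Reduced state of the system after one collision with a fresh reference.
    p = (1.0 - mix) * p + mix * p_ref
    history.append(s_minus_beta_e(p, gap, beta))

print(history[0], history[-1], p, p_ref)
```

The recorded values never decrease, and the population of the system approaches that of the Gibbs state with the parameter $\beta$ of the reference systems.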


In the thermodynamic limit, the above results can be interpreted as being 
indicative of actual thermodynamic behavior of the system under consid- 
eration in its interactions with other systems with which it exchanges 


energy in the form of heat and work. 


2.1.5.3 The minimum principle for the free energy 


Consider two mixed stationary states with slightly differing probability
distributions, one with $w_i = \bar{w}_i = \frac{1}{Z}e^{-\beta E_i}$
($Z = \sum_i e^{-\beta E_i}$; $i = 0, 1, 2, \cdots$) (i.e., the canonical
distribution with temperature $T = (k_B\beta)^{-1}$), and the other with
$w_i = \bar{w}_i + \epsilon_i$, where $\epsilon_i$ ($i = 0, 1, 2, \cdots$) are
small variations in the $w_i$’s ($\sum_i \epsilon_i = 0$). In other words, one
of the two distributions is a canonical one with temperature $T$ while the
other differs from it by small deviations in the weights $w_i$.

Now evaluate the expression $F = U - TS$ for the two distributions, with
$U = \sum_i E_i w_i$, $S = -k_B\sum_i w_i \ln w_i$ for either of the two
(choosing the respective $w_i$’s as above), where $U$ stands for the mean
energy and $S$ for the entropy, and where there is no reference to the system
size. In evaluating $F$ for the second distribution, you will have a zeroth
order term which is just
$F = \sum_i (E_i\bar{w}_i + \beta^{-1}\bar{w}_i \ln\bar{w}_i)$, i.e., the free
energy corresponding to the canonical distribution. In addition, there will be
higher order terms, of which we will focus on terms up to the second order of
smallness, i.e., terms of the second degree in the $\epsilon_i$’s.


One can now check the following statement: the first order variation of
$F$ vanishes, while the second order variation is positive (check this out;
you will need to make use of the fact that $\sum_i \epsilon_i = 0$,
corresponding to the normalization of the weights $w_i$, and that at least
some of the $\epsilon_i$’s are non-zero, which means that the second of the
two distributions is distinct from the first, i.e., the canonical one).


In other words, among all stationary states near one described by a
canonical distribution with a given value of $\beta$ (recall that $V$, $N$ are
assumed to be fixed all along), the ‘free energy’ as defined above is a
minimum for the canonical state.
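The check suggested above can also be done numerically. The following is a minimal sketch with hypothetical energy levels and a hypothetical deviation pattern, in units with $k_B = 1$. It compares the change in $F$ with the second order term $\frac{1}{2\beta}\sum_i \epsilon_i^2/\bar{w}_i$, which is what one obtains on expanding $w_i \ln w_i$ about the canonical weights (the first order term drops out because $\sum_i \epsilon_i = 0$).

```python
import math

def free_energy(weights, energies, beta):
    """F = U - T*S = sum_i (E_i w_i + w_i ln(w_i) / beta), with k_B = 1."""
    return sum(e * w + w * math.log(w) / beta
               for e, w in zip(energies, weights) if w > 0.0)

energies = [0.0, 0.8, 1.7, 3.0]     # hypothetical energy levels
beta = 1.3
boltz = [math.exp(-beta * e) for e in energies]
z = sum(boltz)
w_bar = [b / z for b in boltz]      # canonical weights

shape = [1.0, -2.0, 0.5, 0.5]       # deviation pattern; entries sum to zero
results = []
for scale in (1e-2, 1e-3):
    eps = [scale * s for s in shape]
    w = [wb + e for wb, e in zip(w_bar, eps)]
    delta_f = free_energy(w, energies, beta) - free_energy(w_bar, energies, beta)
    second_order = sum(e * e / wb for e, wb in zip(eps, w_bar)) / (2.0 * beta)
    results.append((scale, delta_f, second_order))
    print(scale, delta_f, second_order)
```

The difference in $F$ is positive for either sign of the deviations, and it approaches the second order estimate as the deviations are made smaller.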


In the thermodynamic limit, this corresponds to the following result: Consider
an equilibrium state with specified values of $T, V, N$ but with no additional
constraints, and another equilibrium state with the same values of $T, V, N$
but with additional internal constraints. If now the internal constraints are
removed, and the system is allowed to attain equilibrium with the specified
values of $T, V, N$, then the free energy (defined as $F = U - TS$ for any
equilibrium state) will decrease in the process.


Note the difference between this principle of minimum free energy and the
principle of maximum entropy referred to in sec. 2.1.5.1 where, in the
latter, one compares the entropies of two equilibrium states with the
specified values of $U, V, N$ (instead of $T, V, N$), one with no additional
constraints and the other with additional internal constraints. In other
words, no energy exchange in any form is to take place between the system and
its surroundings. In the present context, this requires that
$U = \langle E\rangle$ is to be held fixed, in addition to $V, N$.


2.1.6 The quantum mechanical grand canonical ensemble


The grand canonical ensemble describes a mixed stationary state of a sys- 
tem enclosed within a specified volume V where, in addition to the energy, 
the number of particles also varies from one member of the ensemble (an 
imagined copy of the system under consideration) to another. Physically, 
this corresponds to a system that exchanges energy and particles with 


other systems by interacting with the latter. 


While the energy of the system in a stationary state corresponds to an
eigenvalue of the Hamiltonian operator ($H$), the number of particles can
be specified in terms of a number operator $N$ whose eigenvalues are
non-negative integers. The two operators $H$ and $N$ commute with each
other, and pure stationary states with specified values of energy and
particle number are their common eigenvectors of the form $|E_i(N)\rangle$
($N = 1, 2, \cdots$; $i = 1, 2, \cdots$), where $E_i(N)$ stands for a typical
energy eigenvalue for an assembly of $N$ particles in the volume $V$ (unless
stated otherwise, the index $i$ will be assumed to label the energies from the
lowest energy upward). Since the system exchanges both energy and particles
with other systems, its time-invariant states are mixtures of
$|E_i(N)\rangle$ for various different values of $N$ and $E_i(N)$. Of
particular relevance is a mixture involving all possible values of the two, in
which the weight $w_i(N)$ of a state with particle number $N$ and energy
$E_i(N)$ is given by

$$w_i(N) = \frac{1}{Z_g}\exp[-\beta(E_i(N) - \mu N)], \tag{2-39a}$$


where the normalization involves the grand partition function

$$Z_g = \sum_{N=0}^{\infty}\sum_i \exp[-\beta(E_i(N) - \mu N)], \tag{2-39b}$$

and where two constants $\beta$, $\mu$ make their appearance; their
significance will be examined below. Formulae (2-39a), (2-39b) define the
grand canonical ensemble of equilibrium statistical mechanics.


1. In the above sum over $N$, one has to assume that the term with $N = 0$
corresponds to only a single energy $E(0) = 0$, the energy of the vacuum;
all other energies are defined with reference to this vacuum energy.


2. The state space obtained as the direct sum of the Hilbert spaces formed
with the eigenvectors of Hamiltonians specified in terms of the various
possible values of the particle number ($N$) is referred to as the Fock
space ([2], chapter 2). The Fock space is a very convenient construct
for describing systems of identical particles (bosons or fermions), where
the number of particles is not specified beforehand.


The grand canonical density matrix, which is diagonal in the representation
with $|E_i(N)\rangle$ as the basis vectors, can then be expressed as

$$\rho_g = \frac{1}{Z_g}\sum_N\sum_i \exp[-\beta(E_i(N) - \mu N)]\,|E_i(N)\rangle\langle E_i(N)|, \tag{2-40}$$

in terms of which the von Neumann entropy $S_q$ is given by

$$S_q = -\mathrm{Tr}(\rho_g \ln\rho_g) = -\sum_N\sum_i w_i(N)\ln w_i(N). \tag{2-41}$$

The mean value of an observable $A$, in the state described by $\rho_g$, is

$$\langle A\rangle = \mathrm{Tr}(\rho_g A) = \sum_N\sum_i w_i(N)\,A_{iN}, \tag{2-42}$$

where $A_{iN}$ ($N = 1, 2, \cdots$; $i = 1, 2, \cdots$) stands for the diagonal
element ($\langle E_i(N)|A|E_i(N)\rangle$) of the matrix representing $A$ in
terms of the basis vectors $|E_i(N)\rangle$. In particular, the mean energy
and the mean particle number are given by the expressions

$$\langle E\rangle = \frac{1}{Z_g}\sum_N\sum_i E_i(N)\exp[-\beta(E_i(N) - \mu N)], \tag{2-43a}$$

$$\langle N\rangle = \frac{1}{Z_g}\sum_N\sum_i N\exp[-\beta(E_i(N) - \mu N)]. \tag{2-43b}$$
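Formulae (2-39a), (2-39b), (2-43a) and (2-43b) translate directly into a few lines of code. The following is a minimal sketch for a hypothetical spectrum: the dictionary `spectrum` and the function names are illustrative assumptions of mine, and the numbers correspond to two independent single-particle levels (energies 0.5 and 1.5) that can each hold at most one particle, so that the result can be compared with the familiar product form of the grand partition function for such a case.

```python
import math

# Hypothetical spectrum: spectrum[N] lists the energies E_i(N) of the
# N-particle states; the N = 0 entry is the vacuum with E(0) = 0.
spectrum = {0: [0.0], 1: [0.5, 1.5], 2: [2.0]}

def grand_canonical(spectrum, beta, mu):
    """Return (Z_g, weights), weights[(N, i)] being w_i(N) of (2-39a)."""
    factors = {(n, i): math.exp(-beta * (e - mu * n))
               for n, levels in spectrum.items()
               for i, e in enumerate(levels)}
    z_g = sum(factors.values())
    return z_g, {key: f / z_g for key, f in factors.items()}

def mean_values(spectrum, weights):
    """Mean energy and mean particle number, as in (2-43a) and (2-43b)."""
    mean_e = sum(w * spectrum[n][i] for (n, i), w in weights.items())
    mean_n = sum(w * n for (n, i), w in weights.items())
    return mean_e, mean_n

beta, mu = 1.0, 0.3
z_g, weights = grand_canonical(spectrum, beta, mu)
mean_e, mean_n = mean_values(spectrum, weights)
print(z_g, mean_e, mean_n)
```

For this particular spectrum $Z_g$ factorizes into one factor per single-particle level, which provides an independent check on the sums.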


While the grand canonical ensemble is defined in terms of the parameters
$\beta$ and $\mu$, one can equally well choose $\langle E\rangle$,
$\langle N\rangle$ as the defining parameters in virtue of the above two
relations. This brings up the following variational problem: looking at a
mixed state represented by a stationary ensemble with a density matrix of the
form $\sum_N\sum_i w_i(N)|E_i(N)\rangle\langle E_i(N)|$, where the weights
$w_i(N)$ ($N = 1, 2, \cdots$; $i = 1, 2, \cdots$) are time-independent but
otherwise arbitrary, what choice of the set of weights would correspond to the
maximum value of $S_q = -\sum_N\sum_i w_i(N)\ln w_i(N)$, subject to the
constraint of specified values of the mean energy
$\langle E\rangle = \sum_N\sum_i E_i(N)w_i(N)$ and the mean particle number
$\langle N\rangle = \sum_N\sum_i N w_i(N)$?


As in the case of the canonical ensemble, the solution to this variational
problem can be obtained by invoking the method of undetermined multipliers,
when one obtains precisely the weights given by (2-39a), (2-39b),
in which $\beta$, $\beta\mu$ appear as two undetermined parameters whose
values turn out to be related to $\langle E\rangle$, $\langle N\rangle$
precisely as in (2-43a), (2-43b). In other words, the grand canonical ensemble
conforms to the principle of maximum entropy as enunciated in sec. 2.1.2.


Finally, the grand canonical ensemble leads to the definition of a potential
(a thermodynamic potential in the thermodynamic limit) referred to as the
grand potential ($\Omega$):

$$\Omega = -\beta^{-1}\ln Z_g. \tag{2-44a}$$


Starting from this definition, one can introduce the entropy for the grand
canonical ensemble by the formula

$$S = -\frac{\partial\Omega}{\partial T}, \tag{2-44b}$$

where $T = (k_B\beta)^{-1}$ (alternatively, and more commonly, one makes use of
the formula $S = k_B S_q$). One can then verify that $\Omega$ is related to
$U = \langle E\rangle$, $N = \langle N\rangle$ (refer to (2-43a), (2-43b)) by

$$\Omega = U - TS - \mu N, \tag{2-45}$$


which goes over to the corresponding thermodynamic formula in the ther- 


modynamic limit. 


It is straightforward to check that $N = \langle N\rangle$ is alternatively
given by

$$N = -\frac{\partial\Omega}{\partial\mu}, \tag{2-46a}$$

and that, in addition, $S$ is given by

$$S = k_B S_q = -k_B\sum_N\sum_i w_i(N)\ln w_i(N). \tag{2-46b}$$
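The relations (2-44b), (2-45) and (2-46a) can be verified numerically for a small system. The sketch below assumes a hypothetical spectrum and units with $k_B = 1$; the derivatives of the grand potential are estimated by central differences, and the function names are illustrative only.

```python
import math

spectrum = {0: [0.0], 1: [0.5, 1.5], 2: [2.0]}   # hypothetical E_i(N), k_B = 1

def log_z(temperature, mu):
    """ln Z_g for the spectrum above, with beta = 1/T."""
    beta = 1.0 / temperature
    return math.log(sum(math.exp(-beta * (e - mu * n))
                        for n, levels in spectrum.items() for e in levels))

def omega(temperature, mu):
    """Grand potential of (2-44a): Omega = -T ln Z_g."""
    return -temperature * log_z(temperature, mu)

def averages(temperature, mu):
    """Return U = <E>, N = <N> and S = -sum w ln w from the weights."""
    beta = 1.0 / temperature
    z_g = math.exp(log_z(temperature, mu))
    u = n_mean = s = 0.0
    for n, levels in spectrum.items():
        for e in levels:
            w = math.exp(-beta * (e - mu * n)) / z_g
            u += w * e
            n_mean += w * n
            s -= w * math.log(w)
    return u, n_mean, s

t, mu, h = 0.8, 0.3, 1e-5
u, n_mean, s = averages(t, mu)
s_from_omega = -(omega(t + h, mu) - omega(t - h, mu)) / (2 * h)   # (2-44b)
n_from_omega = -(omega(t, mu + h) - omega(t, mu - h)) / (2 * h)   # (2-46a)
print(omega(t, mu), u - t * s - mu * n_mean)                      # (2-45)
print(s, s_from_omega, n_mean, n_from_omega)
```

The two definitions of the entropy, the one through differentiation and the one through the weights, agree to within the accuracy of the finite difference.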


2.1.6.1 The grand potential: minimum principle 


Analogous to the principle of maximum entropy (sec. 2.1.2) and that of
minimum free energy (sec. 2.1.5.3), the grand potential defined in (2-44a)
satisfies a principle of stationarity or, more precisely, one of minimality.
Thus, considering a system with given values of $V, T, \mu$, for which the
probabilities $w_i(N)$ of (2-39a) describe the mixed stationary state
characterized by the potential $\Omega$ that goes over, in the thermodynamic
limit, to the thermodynamic grand potential, imagine a slightly perturbed
distribution described by probabilities $w_i(N)' = w_i(N) + \delta w_i(N)$,
where $\delta w_i(N)$ ($i = 1, 2, \cdots$; $N = 0, 1, 2, \cdots$) are small
variations over the $w_i(N)$, subject to the normalization condition
$\sum_N\sum_i \delta w_i(N) = 0$.

If one now evaluates

$$U' = \sum_N\sum_i E_i(N)\,w_i(N)', \quad N' = \sum_N\sum_i N\,w_i(N)', \quad S' = -k_B\sum_N\sum_i w_i(N)'\ln w_i(N)', \tag{2-47a}$$

and then

$$\Omega' = U' - TS' - \mu N', \tag{2-47b}$$


retaining terms up to the second degree in the $\delta w_i(N)$’s, one will
find that the terms of the first degree add up to zero, while the terms of the
second degree add up to a positive quantity (check this out; the terms
independent of the $\delta w_i(N)$’s add up to $\Omega$). This establishes the
minimum principle for the grand potential. In the thermodynamic limit, this
principle goes over to the corresponding thermodynamic principle of
stationarity.
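As with the free energy, the minimum principle for the grand potential can be tested numerically. The following is a minimal sketch under assumptions of my own: a hypothetical list of states labeled by particle number and energy, units with $k_B = 1$, and a hypothetical zero-sum deviation pattern. It evaluates $U' - TS' - \mu N'$ for perturbed weights and compares it with $\Omega = -\beta^{-1}\ln Z_g$.

```python
import math

states = [(0, 0.0), (1, 0.5), (1, 1.5), (2, 2.0)]   # hypothetical (N, E_i(N))
temperature, mu = 0.8, 0.3                           # units with k_B = 1

def grand_potential(weights):
    """Omega' = U' - T*S' - mu*N' for arbitrary weights, as in (2-47b)."""
    u = sum(w * e for w, (_, e) in zip(weights, states))
    n = sum(w * n_part for w, (n_part, _) in zip(weights, states))
    s = -sum(w * math.log(w) for w in weights if w > 0.0)
    return u - temperature * s - mu * n

factors = [math.exp(-(e - mu * n) / temperature) for n, e in states]
z_g = sum(factors)
w_eq = [f / z_g for f in factors]            # grand canonical weights (2-39a)
omega_eq = -temperature * math.log(z_g)      # grand potential (2-44a)

shape = [1.0, -1.0, 2.0, -2.0]               # deviation pattern; sums to zero
excess = []
for scale in (1e-2, 1e-3, -1e-3):
    perturbed = [w + scale * s for w, s in zip(w_eq, shape)]
    excess.append(grand_potential(perturbed) - omega_eq)

print(excess)
```

The excess is positive for deviations of either sign, and the unperturbed weights reproduce $\Omega$ itself.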


On the other hand, as mentioned above, the principle of maximum entropy
holds when one considers variations in the probabilities $w_i(N)$ subject to
the constraint of fixed $\langle E\rangle$ and $\langle N\rangle$.


2.1.6.2 Families of grand canonical ensemble: thermodynamic model 


As in the case of the canonical ensemble, the grand canonical ensemble
provides one formally with a thermodynamic model for a system of
arbitrary size. In this, one interprets $V$, $\langle E\rangle$,
$\langle N\rangle$ as variables analogous to the corresponding thermodynamic
variables (respectively, the volume, internal energy ($U$), and the number of
moles multiplied with the Avogadro number ($\nu A$)) and, moreover, defines
other relevant variables by considering families of grand canonical ensembles
with varying values of $V$, $T = (k_B\beta)^{-1}$, $\mu$, since these other
variables require partial differentiation of the grand potential $\Omega$ with
respect to $V$, $T$, and $\mu$ (with the non-thermodynamic variables assumed
to be held constant; refer to sec. 2.1.3) as follows:

$$p = -\left(\frac{\partial\Omega}{\partial V}\right)_{T,\mu}, \quad S = -\left(\frac{\partial\Omega}{\partial T}\right)_{V,\mu}, \quad N = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{V,T}. \tag{2-48}$$


In the thermodynamic limit, these represent the usual thermodynamic
state variables, i.e., respectively, the pressure, entropy, and the mole
number times the Avogadro number (one can check from the definition
of $\rho_g$ and $Z_g$ and from the third relation in (2-48), that $N$ equals
$\langle N\rangle$), with $T$, $\mu$ being, respectively, the temperature and
the chemical potential per particle (i.e., the chemical potential per mole
divided by $A$, where $A$ stands for the Avogadro number). All these
have well defined values in any mixed stationary state described by the
grand canonical ensemble with specified values of $V, T, \mu$, even for a
‘small’ system (i.e., one away from the thermodynamic limit, with the proviso
of constant values of the non-thermodynamic variables), and hence may
be formally referred to as state variables for it. According to the above
definitions, these satisfy the relation

$$d\Omega = -S\,dT - p\,dV - N\,d\mu, \tag{2-49}$$


analogous to the basic thermodynamic formula (1-2). Though this does
not have the same relevance as in thermodynamics, still it can be given an
operational interpretation as in the case of the canonical ensemble (refer
to sec. 2.1.5.2), thereby giving us a thermodynamic model, where the
state variables defined above satisfy all the relations that hold between
thermodynamic state variables. In particular, the thermodynamic Euler
formula

$$\Omega = U - TS - \mu N, \tag{2-50}$$


is conformed to (check this out; refer to the analogous formula (2-24)),
even though, for a system away from the thermodynamic limit, the variables
$\Omega, U, S, N$ are not, in general, extensive ones (i.e., do not increase
linearly with $V$). In the thermodynamic limit, the grand potential is given
by the formula

$$\Omega = -pV, \tag{2-51}$$

which is obtained from formula (2-50) and the formula $U = TS - pV + \mu N$,
where the former holds for systems of arbitrary size while the latter holds
only in the thermodynamic limit.


The Euler formula, such as (2-50), derived in one ensemble cannot be
expected to agree with that in some other ensemble (such as the formula
$U = TS - pV$ in the microcanonical ensemble or $F = U - TS$ in the canonical
ensemble) for systems away from the thermodynamic limit, one reason for
which is that $\mu$ cannot be consistently defined in the microcanonical or
the canonical ensemble. All the Euler formulas agree, however, in the
thermodynamic limit where one has to replace the formula $U = TS - pV$ with
the more general $U = TS - pV + \mu N$.

Incidentally, the state variable $U = \langle E\rangle$ (the internal energy
in the thermodynamic limit), given by (2-43a), can be expressed as

$$U (= \langle E\rangle) = -\left(\frac{\partial \ln Z_g}{\partial\beta}\right)_{V,z}, \tag{2-52}$$

where, in differentiating with respect to $\beta$, one has to keep

$$z = e^{\beta\mu} \tag{2-53}$$

fixed, where $z$ is referred to as the fugacity. It is a state function of
quite considerable relevance, especially in the thermodynamic limit.
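The role of the fugacity in the derivative with respect to $\beta$ can be made concrete with a short numerical check. The sketch below is based on assumptions of mine: a hypothetical spectrum, the symbol `z` for the fugacity $e^{\beta\mu}$, and the form $U = -\partial \ln Z_g/\partial\beta$ at fixed fugacity, which follows from (2-39b) and (2-43a) on writing $Z_g = \sum_N z^N \sum_i e^{-\beta E_i(N)}$. It also shows, for contrast, what one gets on holding $\mu$ fixed instead.

```python
import math

spectrum = {0: [0.0], 1: [0.5, 1.5], 2: [2.0]}   # hypothetical E_i(N)

def log_z_fugacity(beta, z):
    """ln Z_g as a function of beta and the fugacity z = exp(beta*mu)."""
    return math.log(sum(z ** n * math.exp(-beta * e)
                        for n, levels in spectrum.items() for e in levels))

def mean_energy(beta, mu):
    """<E> evaluated directly from the weights, as in (2-43a)."""
    terms = [(e, math.exp(-beta * (e - mu * n)))
             for n, levels in spectrum.items() for e in levels]
    z_g = sum(f for _, f in terms)
    return sum(e * f for e, f in terms) / z_g

beta, mu, h = 1.2, 0.3, 1e-5
z = math.exp(beta * mu)

# Derivative with the fugacity held fixed: reproduces <E>.
u_fixed_z = -(log_z_fugacity(beta + h, z)
              - log_z_fugacity(beta - h, z)) / (2 * h)

def log_z_fixed_mu(b):
    """ln Z_g as a function of beta alone, with mu held fixed."""
    return log_z_fugacity(b, math.exp(b * mu))

# Derivative with mu held fixed: gives <E> - mu*<N> instead of <E>.
u_fixed_mu = -(log_z_fixed_mu(beta + h) - log_z_fixed_mu(beta - h)) / (2 * h)

u_direct = mean_energy(beta, mu)
print(u_direct, u_fixed_z, u_fixed_mu)
```

Only the derivative taken at fixed fugacity reproduces the mean energy; the one taken at fixed $\mu$ differs from it by $\mu\langle N\rangle$.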


In providing us formally with a thermodynamic model, the grand canonical
ensemble extends the scope of applicability of the canonical ensemble
since it includes the chemical potential as a state variable (in the formal
sense mentioned above). This extension of scope is made possible by the
transformation from $F$ to $\Omega$ which, incidentally, is a Legendre
transformation.


The grand canonical ensemble provides one with a particularly useful 
means for the calculation and explanation of characteristic features of 


quantum mechanical systems (see, for instance, section 3.3.3). 


2.2 The classical equilibrium ensembles 


I have introduced the equilibrium ensembles of quantum statistical
mechanics in section 2.1, where these were defined as mixed stationary
states made up of eigenstates of the Hamiltonian operator, with appropriate
sets of weights (i.e., probability distributions) assigned to these
eigenstates. The microcanonical ensemble, the canonical ensemble, and the
grand canonical ensemble, introduced in sections 2.1.1, 2.1.5, and 2.1.6 
acquire relevance (along with a number of other ensembles in certain spe- 
cific contexts) for thermodynamic systems (i.e., for systems in the ther- 
modynamic limit or, from a physical point of view, ones close to it) when 
these describe the thermodynamic equilibrium states of these systems 
and lead us to state functions that agree with experimentally determined 
state functions and satisfy the standard thermodynamic relations among 


themselves. 


Away from the thermodynamic limit, the ensembles lose their thermody- 
namic significance, though they continue to remain legitimate quantum 
mechanical objects corresponding to stationary density matrices, and the 
question arises as to whether these could then have any relevance what- 
ever. Indeed, subsystems of large systems, described by means of families 
of canonical or grand canonical ensembles, conform to relations among 
state variables (defined in a formal sense) analogous to thermodynamic 
ones even though they are not thermodynamic systems themselves, and 


may even be made up of small numbers of particles. 


Analogous ensembles may be introduced in the classical context as well,
where one starts from distribution functions over points in the phase
space, but then effectively goes over to probability distributions over pure
states associated with fundamental phase cells into which the phase
space is imagined to be partitioned. The physically relevant quantities
that emerge in such a description are the probabilities associated with
given regions in the phase space.


More precisely, consider a distribution function $\rho(\{q\}, \{p\})$ in the
phase space $\Gamma$ (refer back to section 1.3.2.2) which gives the relative
(i.e., unnormalized) probabilities of sets of points in this space.


We introduce a notational simplification here. Instead of $\{q\}$, $\{p\}$
denoting the sets of position and momentum co-ordinates in the phase space
(respectively, $q_1, q_2, \cdots, q_s$ and $p_1, p_2, \cdots, p_s$ for a system
with $s$ degrees of freedom), we will use the notation $Q$, $P$ to represent
these sets of co-ordinates, while the volume element
$d^{s}q\,d^{s}p = dq_1 dq_2 \cdots dq_s\,dp_1 dp_2 \cdots dp_s$
will be denoted by $dQ\,dP$.


The symbol Q is also used in the thermodynamic context when referring to 
the heat absorbed or given out by a system. This, however, need not cause 


any confusion with the notation introduced above. 


Thus, for an ensemble corresponding to the distribution function
$\rho(\{q\}, \{p\})$, $\rho(Q, P)\,dQ\,dP$ gives the relative probability
associated with a volume element $dQ\,dP$ in $\Gamma$ which, however,
corresponds to a volume element $\frac{1}{N!}dQ\,dP$ in the true phase space
(where we refer to a system of $N$ identical particles, for which $s = 3N$).
The number of fundamental phase cells, each of volume measure $h^{3N}$,
associated with this volume element in the true phase space is
$\frac{1}{N!h^{3N}}dQ\,dP$. In other words, the relative probability
associated with the pure states within a volume element $dQ\,dP$ in $\Gamma$
can be taken to be $\frac{1}{N!h^{3N}}\rho(Q, P)\,dQ\,dP$ (refer back to
sections 1.3.2.2 and 1.3.2.4).

Since the total probability associated with all the pure states in the entire
space $\Gamma$ has to be normalized to unity, the probability (or the weight)
associated with the pure states within any given region $R$ in the phase
space is seen to be

$$w_R = \frac{1}{Z}\int_R \frac{1}{N!h^{3N}}\rho(Q, P)\,dQ\,dP, \tag{2-54a}$$

where the denominator $Z$, referred to as the (classical) partition function
corresponding to the distribution function $\rho(Q, P)$ we started with, is
given by

$$Z = \int \frac{1}{N!h^{3N}}\rho(Q, P)\,dQ\,dP, \tag{2-54b}$$

and where the integration in (2-54b) extends over the entire phase space
$\Gamma$.


While $\rho(Q, P)$ is the underlying object of interest, it is $w_R$ (for
every possible choice of the region $R$) that is of more direct relevance. On
the face of it, it might seem that the factor $\frac{1}{N!h^{3N}}$ gets
canceled in the right hand side of (2-54a) (refer to (2-54b)). However, as we
will see with the various choices for the distribution function $\rho$, the
partition function $Z$ leads to a set of relevant thermodynamic state
variables for the system under consideration, and hence the factor
$\frac{1}{N!h^{3N}}$ too is indeed of substantial relevance.


Recall that the factor $\frac{1}{N!h^{3N}}$ had to be introduced in order to
account for, first, the fact that the particles making up the system under
consideration were identical ones and, secondly, the fact that the non-zero
finite value of the Planck constant ($h$) precludes the possibility of pure
states in the phase space with arbitrarily small separation between them.
This, however, is only a halfway measure and the classical theory remains one
of limited scope, though still of wide applicability. Results derived from the
classical theory fail to agree with observations under conditions where
the more complete quantum theory assumes relevance, since the
indistinguishability of particles and the finite non-zero value of $h$ are
taken into account in the quantum formulation in a consistent manner. This
aspect of the classical theory, namely its validity in a limiting sense within
a limited (but still wide) area of discourse, will become apparent as we move
ahead (refer, for instance, to sections 3.3.4, 4.2.1, and to applications of
classical statistical mechanics in chapters 4, 6).


Given a distribution function $\rho$ and an observable $A$ for the system
under consideration, one obtains a distribution of measured values of the
observable, where the mean value (denoted by $\langle A\rangle$) and the
variance (as revealed by a large number of measurements of the observable in
the mixed state under consideration) are given by the formulas

$$\langle A\rangle = \frac{1}{Z}\int \frac{1}{N!h^{3N}}A(Q, P)\rho(Q, P)\,dQ\,dP, \tag{2-55a}$$

$$(\Delta A)^2 = \langle (A - \langle A\rangle)^2\rangle = \frac{1}{Z}\int \frac{1}{N!h^{3N}}A(Q, P)^2\rho(Q, P)\,dQ\,dP - \left(\frac{1}{Z}\int \frac{1}{N!h^{3N}}A(Q, P)\rho(Q, P)\,dQ\,dP\right)^2, \tag{2-55b}$$

where, in both these expressions, the integration is to be assumed to
extend over the entire phase space $\Gamma$.
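The formulas for the partition function, the mean and the variance can be tried out on a simple case. The following is a minimal sketch, assuming a single particle with one degree of freedom (so that the factor multiplying $dQ\,dP$ reduces to $1/h$) and a hypothetical Gaussian distribution function; the integrals are estimated by the midpoint rule on a finite grid, and all names and numerical values are illustrative.

```python
import math

def phase_average(observable, rho, h, limit=8.0, steps=200):
    """Midpoint-rule estimates of Z and <A> for one degree of freedom."""
    dq = dp = 2.0 * limit / steps
    z = weighted = 0.0
    for i in range(steps):
        q = -limit + (i + 0.5) * dq
        for j in range(steps):
            p = -limit + (j + 0.5) * dp
            cells = rho(q, p) * dq * dp / h    # weighted number of phase cells
            z += cells
            weighted += observable(q, p) * cells
    return z, weighted / z

def rho(q, p):
    """Hypothetical unnormalized distribution, chosen only for illustration."""
    return math.exp(-0.5 * (q * q + p * p))

z1, mean_q2 = phase_average(lambda q, p: q * q, rho, h=1.0)
z2, mean_q2_small_h = phase_average(lambda q, p: q * q, rho, h=0.5)
_, mean_q4 = phase_average(lambda q, p: q ** 4, rho, h=1.0)
variance_q2 = mean_q4 - mean_q2 ** 2           # variance of the observable q^2

print(z1, z2, mean_q2, mean_q2_small_h, variance_q2)
```

Changing the value assumed for $h$ rescales the partition function but leaves the mean value untouched, which is the cancellation referred to in the text.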


2.2.1 The classical microcanonical ensemble 
2.2.1.1 Microcanonical ensemble: basic considerations 


Consider, in the classical description, a system isolated from the rest
of the universe, with energy lying within a narrow range, say, between $U$
and $U + \delta U$ ($\delta U \ll U$). The microcanonical ensemble of classical
statistical mechanics, which describes the equilibrium state of such a system
(in the thermodynamic limit), corresponds to the distribution function

$$\rho(Q, P) = \begin{cases} 1 & \text{for all } (Q, P) \text{ satisfying } U < H(Q, P) < U + \delta U, \\ 0 & \text{for all other points}, \end{cases} \tag{2-56}$$


i.e., to a constant value of the relative probability associated with points
in the phase space for which the energy lies within the range $U$ to
$U + \delta U$. The region in the phase space corresponding to this range of
energy values will be denoted by $R_U$ (as we will see, the exact value of
$\delta U$ is, in a sense, irrelevant in the thermodynamic limit; according to
the above formula, $\rho$ is the characteristic function of the region $R_U$
in the phase space). This means that the mean value of an observable
$A(Q, P)$ (as revealed in a large number of observations with the system in
the same state in each observation) is given by

$$\langle A\rangle = \frac{1}{Z_m}\int_{R_U} \frac{1}{N!h^{3N}}A(Q, P)\,dQ\,dP, \tag{2-57a}$$

where $Z_m$, the microcanonical partition function, is given by

$$Z_m = \int_{R_U} \frac{1}{N!h^{3N}}\,dQ\,dP. \tag{2-57b}$$

In other words, $Z_m$ is nothing but the number of phase cells within the
region $R_U$, i.e., the number of pure states associated with this region
under the assumed partitioning of the phase space into fundamental phase
cells. In the following, this will also be denoted by the alternative symbol
$W$:

$$W = Z_m. \tag{2-57c}$$


As mentioned in section 1.3.1, the pure states of a macroscopic system
are referred to as ‘microstates’, as opposed to ‘macrostates’, the latter
being a term commonly denoting thermodynamic states of the system,
i.e., mixed states determined by only a few thermodynamic parameters.
In terms of this designation, $W$ denotes the number of microstates in the
accessible region of the phase space, this being the region $R_U$ determined
by the constraint imposed on the system, namely that of a specified value of
the energy (to within a narrow range $U$ to $U + \delta U$).


This description of the microcanonical ensemble is completed by the
following definition of the ‘entropy’ of the corresponding mixed stationary
state in terms of the number of microstates $W$ in the accessible region of
phase space:

$$S = k_B \ln W, \tag{2-58}$$

this being analogous to the relation (2-2) postulated in the quantum context.
This definition acquires physical relevance in the thermodynamic
limit when $S$ represents the thermodynamic entropy of the system under
consideration.


The postulate that $S$ represents the thermodynamic entropy is motivated
by the observation that the latter is an additive state function, its value
($S$) for a composite system made up of two independent subsystems being
given by $S = S_1 + S_2$, where $S_1$, $S_2$ stand for the entropies of the
subsystems. The number of microstates, on the other hand, is a multiplicative
variable whose value for the composite system satisfies $W = W_1 W_2$, where
$W_1$, $W_2$ denote the number of microstates in the accessible phase spaces
of the two subsystems. This, in turn, is a consequence of the fact that the
energies of the subsystems (say, $U_1$, $U_2$) uniquely determine the energy of
the composite system ($U = U_1 + U_2$) and, since the energy is the only
constraint (other than $V$ and $N$) in specifying the microcanonical ensemble,
the accessible phase space of the composite system is just the Cartesian
product (in the sense of set theory) of the accessible phase spaces of the two
subsystems.


Given the definition (2-58) of the state function $S$ in terms of $U, V, N$,
and given a family of states with continuously varying values of $U$ and $V$,
a new state function $T$ can be obtained by differentiation by means of the
formula

$$\frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_V, \tag{2-59}$$

analogous to the definition (2-9) in the quantum context, while the
‘pressure’ ($p$) is similarly defined as

$$p = T\left(\frac{\partial S}{\partial V}\right)_U. \tag{2-60}$$
It may be mentioned here that, for either the microcanonical or the
canonical ensemble (in the quantum description as also in the classical one),
the number of particles $N$ is assumed to be fixed, having an integer value,
and families of these ensembles cannot be referred to for continuously
varying values of $N$. Hence, strictly speaking, no analog of the
thermodynamic chemical potential can be defined for states defined by either
of these ensembles.

The expression for entropy as a function of $U, V, N$ is referred to as the
fundamental thermodynamic relation for the system under consideration,
since it gives the equations of state of the system (such as the
expression for $p$ in terms of $T, V$, for any given value of $N$, for a
fluid).
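To see how a fundamental relation yields the equations of state through (2-59) and (2-60), one may take a concrete form of $S(U, V, N)$. The sketch below assumes the ideal-gas form $S = N k_B[\ln V + \frac{3}{2}\ln U] + \text{const}$, which is not derived in this section and is used here only as an illustrative input; units with $k_B = 1$ are assumed, and the derivatives are estimated by central differences.

```python
import math

K_B = 1.0  # units with k_B = 1

def entropy(u, v, n):
    """Assumed ideal-gas form S = N k_B [ln V + (3/2) ln U], constants dropped."""
    return n * K_B * (math.log(v) + 1.5 * math.log(u))

def temperature(u, v, n, h=1e-4):
    """1/T = (dS/dU) at fixed V, as in (2-59), by a central difference."""
    ds_du = (entropy(u + h, v, n) - entropy(u - h, v, n)) / (2 * h)
    return 1.0 / ds_du

def pressure(u, v, n, h=1e-4):
    """p = T (dS/dV) at fixed U, as in (2-60), by a central difference."""
    ds_dv = (entropy(u, v + h, n) - entropy(u, v - h, n)) / (2 * h)
    return temperature(u, v, n, h) * ds_dv

u, v, n = 150.0, 20.0, 100
t = temperature(u, v, n)
p = pressure(u, v, n)
print(t, p, n * K_B * t / v)
```

With this input the two derivatives reproduce $U = \frac{3}{2}N k_B T$ and $pV = N k_B T$, i.e., the equations of state of the ideal gas.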


2.2.1.2 State functions from families of microcanonical ensembles 


Consider the state functions $S, T, p$, designated respectively as the
‘entropy’, ‘temperature’, and ‘pressure’, defined as above for a family of
mixed stationary states of the system under consideration corresponding to
continuously varying values of $U, V$. The reason why the quotation marks
(‘.’) are used in the names of these functions is that these are only formally
defined ones when the system under consideration is away from the
thermodynamic limit (refer back to sec. 2.1.3, where the non-thermodynamic
variables were encountered); the energy and the other state functions depend
on these variables, which are to be assumed to be held constant in taking the
derivatives in sec. 2.2.1. With $S, T, p$ defined as above, one obtains the
relation


TdS = dU + pdV, (2-61) 


analogous to the basic thermodynamic relation for a simple fluid made up 
of a given number of particles. However, functions so introduced do not 
possess thermodynamic relevance for systems of finite size. For instance, 
the ‘entropy’ S does not turn out to be an extensive variable and the Euler 


formula 


U+pV =TS, (2-62) 


is not satisfied unless the thermodynamic limit is conformed to. This is 
in contrast to the canonical and the grand canonical ensembles, where 
the state functions introduced in an analogous manner satisfy all the 
thermodynamic relations (including the relevant Euler formulas), though 
they are still not extensive variables away from the thermodynamic limit 


(refer to sec. 2.1.6). 


The classical microcanonical ensemble differs from the corresponding 


quantum ensemble in that the functions T and p can be introduced even
for a finite system by defining the derivatives with respect to U and V (in
the quantum case, S is a discontinuous function of U and V), though the
dependence on the non-thermodynamic variables ($\xi$; refer to sec. 2.1.3)
persists for systems away from the thermodynamic limit.


2.2.1.3 Microcanonical ensemble: alternative definition 


The microcanonical ensemble can be defined in an alternative way, as in
the quantum description (refer to sec. 2.1.1.1), by choosing the
(unnormalized) distribution function $\rho$ to be the characteristic
function of the region $\widetilde{R}_U$ of the phase space instead of the
shell $R_U$ (fig. 2-2), where $\widetilde{R}_U$ is the region in the phase
space made up of points for which the energy $H(Q,P)$ lies within the
range 0 to $U$:

$$\rho(Q,P) = \begin{cases} 1 & \text{for all } (Q,P) \text{ satisfying } 0 < H(Q,P) < U,\\ 0 & \text{for all other points.} \end{cases} \tag{2-63}$$


While it may appear paradoxical that the results derived with the
microcanonical ensemble should not depend on whether one chooses the full
region $\widetilde{R}_U$ ($0 < H(Q,P) < U$) or the thin shell $R_U$
($U < H(Q,P) < U + \delta U$) as the relevant region, the two definitions
indeed turn out to be equivalent in the thermodynamic limit. An instance
of this general result will be found below for the classical ideal gas
(sec. 3.2.1).
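The equivalence can be illustrated numerically in a case where the phase space volumes are known in closed form. The following is a minimal sketch, assuming (this is not taken from the text above) an ideal gas of N particles, for which the volume of the region H < U scales as U^(3N/2); it evaluates the difference between the logarithms of the shell volume (U to U + δU) and of the full-region volume, per particle. The function names and the value chosen for δU/U are illustrative only.

```python
import math


def log_expm1(x):
    """Return ln(exp(x) - 1) for x > 0 without overflow for large x."""
    if x > 30.0:
        # exp(x) - 1 = exp(x) * (1 - exp(-x))
        return x + math.log1p(-math.exp(-x))
    return math.log(math.expm1(x))


def entropy_gap_per_particle(n_particles, rel_width):
    """(ln W_shell - ln W_full) / N for an assumed ideal gas.

    W_full(U) is taken proportional to U**(3N/2), so that
    W_shell / W_full = (1 + dU/U)**(3N/2) - 1, with rel_width = dU/U.
    """
    exponent = 1.5 * n_particles
    x = exponent * math.log1p(rel_width)
    return log_expm1(x) / n_particles


if __name__ == "__main__":
    for n in (1, 10, 1000, 10**6, 10**9):
        print(n, entropy_gap_per_particle(n, 1e-3))
```

For small N the two definitions differ appreciably, while for large N the difference per particle tends to (3/2) ln(1 + δU/U), which is negligible for δU << U.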


The result that, in the thermodynamic limit ($N \to \infty$, $V \to \infty$,
$N/V \to$ finite), the microcanonical ensemble leads to thermodynamic
behavior, with state functions like the entropy and the internal energy
turning out to be extensive, presupposes that the potential energy
function $\Phi$ satisfies a certain stability criterion (refer to section
class-stab-sec). The ideal gas is on the borderline of this criterion and
satisfies it in a limiting sense.


Figure 2-2: Depicting the regions $\widetilde{R}_U$ and $R_U$ in the phase
space (schematic; the notional phase space $\Gamma$ of sec. 1.3.2.2 is
referred to); the two regions correspond, respectively, to
$0 < H(Q,P) < U$ and $U < H(Q,P) < U + \delta U$ ($\delta U \ll U$); the
(unnormalized) distribution function $\rho(Q,P)$ defining the classical
microcanonical ensemble can be chosen as the characteristic function of
either of these two regions, the two choices being equivalent in the
thermodynamic limit.


We will employ the expression (2-58) to obtain the fundamental equation 
for an ideal gas in sec. 3.2.1, obtaining therefrom the equation of state 
for the gas — a formula that has been found to agree with the experimen- 
tally observed behavior of actual gases over a quite considerable range of 


situations. 


Referring to formulas (2-56) and (2-63), one can define the microcanon- 
ical ensemble in yet another way by taking W as the number of acces- 
sible pure states with a sharply specified energy value U, which can be 
thought of as the value of W in a narrow energy range, with the width 
of that range going to the limiting value zero. Once again, this is equiv- 
alent to the other two definitions of the microcanonical ensemble in the 


thermodynamic limit.


2.2.2 The canonical ensemble: classical


The canonical ensemble corresponds to the operational specification of
the stationary state of a system (we continue to consider, for simplicity,
a simple fluid with volume $V$ and number of particles $N$) in terms of a
parameter $\beta$ which, for a thermodynamic system, relates to its
thermodynamic temperature $T$ (refer to formula (2-19)) as

$$\beta = \frac{1}{k_B T}. \tag{2-64}$$


Operationally, the system is kept in thermal contact with a reservoir at
temperature $T$, or is assumed to be in continuing interaction (or, repeated
‘collisions’) with one or more systems characterized by the same value of
$\beta$, as in sec. 2.1.5.2, in which case one defines $(k_B\beta)^{-1}$ as
the ‘temperature’ ($T$) of the system, even when the latter is of a finite
size.


Note that this operational specification does not constrain the energy of
the system, as a result of which the energy can have all possible values
(the system under consideration exchanges energy with other systems
with which it interacts, subject to the constraint of a specified value of
$\beta$) corresponding to points at arbitrary locations in the entire phase
space, though the mean value of the energy gets determined by the
specified value of $\beta$, as we see below. Thus, in contrast to the
microcanonical ensemble, the entire phase space is accessible in so far as
the relevant


microstates are concerned.

The (un-normalized) distribution function $\rho$, determining the
probabilities of microstates associated with any specified region of the
phase space, is given by

$$\rho(Q,P) = \exp[-\beta H(Q,P)], \tag{2-65}$$

where $H$ stands for the Hamiltonian of the system (refer to (1-7)). Thus
the probability of occurrence of a pure state associated with any
specified region $R$ of the phase space is (refer to (2-54a), (2-54b))

$$P(R) = \frac{1}{Z_c}\,\frac{1}{N!\,h^{3N}} \int_R e^{-\beta H(Q,P)}\, dQ\,dP, \tag{2-66a}$$

where the denominator $Z_c$ is now the (classical) canonical partition
function, given by

$$Z_c = \frac{1}{N!\,h^{3N}} \int e^{-\beta H(Q,P)}\, dQ\,dP \tag{2-66b}$$


(it may be mentioned that the Hamiltonian $H(Q,P)$ is at times written
as $E(Q,P)$ (or, simply, $E$, the energy function) as a convenient
notational convention). The mean energy of the system represented by the
canonical ensemble is then

$$U = \langle H \rangle = \frac{1}{Z_c}\,\frac{1}{N!\,h^{3N}} \int H(Q,P)\, e^{-\beta H(Q,P)}\, dQ\,dP, \tag{2-67a}$$


which, in the thermodynamic limit, represents its internal energy. The 
above formula then gives the internal energy as a function of the variables 


T, V, N pertaining to the system, when one considers a family of canonical
ensembles corresponding to variable values of these parameters. One
finds, as an equivalent expression for $U$,

$$U = -\frac{\partial \ln Z_c}{\partial \beta}. \tag{2-67b}$$


Analogous to the entropy defined in terms of the microcanonical partition
function (refer to (2-57b), (2-58)), which acts as the thermodynamic
potential when U, V, N are chosen as the basic state variables, the
canonical partition function leads to the definition of the free energy
(F) of the system, where the latter plays the role of the thermodynamic
potential when T, V, N are chosen as the basic state variables. As
indicated in sections 1.2.5 and 1.2.6, thermodynamic formulas with T, V, N
as the basic variables are related to those with U, V, N as the basic
variables by a Legendre transformation.


The relation between the ‘free energy’ (we use quotation marks in the
names of the state variables when systems of finite size are included
formally, with the non-thermodynamic variables $\xi$ introduced in
sec. 2.1.3 assumed to be held constant) and the canonical partition
function is of the form

$$F = -\beta^{-1} \ln Z_c, \quad \text{i.e.,} \quad Z_c = e^{-\beta F}, \tag{2-68}$$

while the formulas defining other thermodynamic functions by means of
differentiation are

$$S = -\left(\frac{\partial F}{\partial T}\right)_{V,N}, \qquad p = -\left(\frac{\partial F}{\partial V}\right)_{T,N}. \tag{2-69}$$


Recall that, as with the microcanonical ensemble, the canonical ensemble 
is defined for a fixed integer N, which limits its applicability in that the 
chemical potential cannot be defined in a consistent manner for any given 
finite N. Within the framework of thermodynamics, however, i.e., in the 


thermodynamic limit, this ceases to be a problem. 


We will see how the canonical ensemble formulas give us relations among
the thermodynamic variables of an ideal gas in conformity with those
obtained with the microcanonical ensemble, where the two are to be
considered in the limit of large N and V (the thermodynamic limit). However, as
mentioned in sec. 2.1.5, the canonical ensemble provides us with a ther- 
modynamic model even for small systems, though the state variables, 
analogous to the corresponding thermodynamic ones, do not share the 
extensivity property of the latter. This applies, in particular, to even a 
single molecule of an ideal gas as the system of interest. The canoni- 
cal ensemble, invoked for this system, leads us, among other things, to 


Maxwell’s velocity distribution formula. 


As a simple example, we consider a 1D harmonic oscillator in equilibrium
at temperature $T$, where the Hamiltonian is the classical version of
(2-27),

$$H(q,p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 q^2. \tag{2-70}$$

The partition function is given by

$$Z_c = \frac{1}{h} \int dq\, dp\, \exp\left[-\beta\left(\frac{p^2}{2m} + \frac{1}{2} m\omega^2 q^2\right)\right], \tag{2-71}$$

where we have made use of (2-66b) with $N = 1$, while replacing
$\frac{1}{h^3}$ with $\frac{1}{h}$, since we are considering a 1D
oscillator. Evaluating the Gaussian integrals, one obtains

$$Z_c = \frac{2\pi}{\beta h \omega} = \frac{k_B T}{\hbar\omega}, \tag{2-72}$$

which agrees with the limiting form of the right hand side of (2-28) as
$\hbar \to 0$, as indeed it should.

For later use we note that the mean energy of the 1D classical oscillator
at temperature $T$ is (refer to (2-67b))

$$U = \langle H \rangle = k_B T. \tag{2-73}$$
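The oscillator results can be checked numerically. The sketch below is only an illustration under stated assumptions: units are chosen with $k_B = 1$, the default values $m = \omega = h = 1$ are arbitrary, and the helper names (`integrate`, `partition_function`, `mean_energy`) are hypothetical. It evaluates (2-71) by direct quadrature and then obtains the mean energy from (2-67b) by a finite difference in $\beta$.

```python
import math


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))


def partition_function(beta, m=1.0, omega=1.0, planck_h=1.0):
    """Z_c = (1/h) * integral of exp(-beta*H) over q and p, as in (2-71).

    The integrand factorizes into a p-integral and a q-integral; each is
    cut off at 12 standard deviations of the corresponding Gaussian.
    """
    sigma_p = math.sqrt(m / beta)
    sigma_q = math.sqrt(1.0 / (beta * m * omega ** 2))
    int_p = integrate(lambda p: math.exp(-beta * p * p / (2.0 * m)),
                      -12.0 * sigma_p, 12.0 * sigma_p)
    int_q = integrate(lambda q: math.exp(-beta * 0.5 * m * omega ** 2 * q * q),
                      -12.0 * sigma_q, 12.0 * sigma_q)
    return int_p * int_q / planck_h


def mean_energy(beta, d_beta=1e-4, **params):
    """U = -d(ln Z_c)/d(beta), estimated by a central finite difference."""
    upper = math.log(partition_function(beta + d_beta, **params))
    lower = math.log(partition_function(beta - d_beta, **params))
    return -(upper - lower) / (2.0 * d_beta)


if __name__ == "__main__":
    beta = 2.0
    print(partition_function(beta), 2.0 * math.pi / beta)  # compare with (2-72)
    print(mean_energy(beta), 1.0 / beta)                   # compare with (2-73)
```

With these units, (2-72) reads $Z_c = 2\pi/\beta$ and (2-73) reads $U = 1/\beta$, which is what the two printed pairs compare.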


We next consider the example of a single molecule of mass $m$ in
equilibrium in an ideal gas at temperature $T$ in a volume $V$. Though the
molecule under consideration is identical to all the other molecules of
the gas, one can, in a certain approximate sense, assume that it can be
treated as a distinct particle (refer to formula (3-60b) in section 3.3.4).
One then obtains the translational partition function as

$$z = \frac{1}{h^3} \int d^3r \int d^3p\; e^{-\beta p^2/2m} \tag{2-74a}$$

(reason this out; $\mathbf r$, $\mathbf p$ denote the position and momentum
variables of the particle; $\beta = (k_B T)^{-1}$). Evaluating the Gaussian
integrals involving the momentum components, each of which varies over
the range from $-\infty$ to $\infty$, one obtains

$$z = \frac{V}{h^3} \left(\frac{2\pi m}{\beta}\right)^{3/2} \tag{2-74b}$$

(check this out). From this, we obtain the following expression for the
probability density $P(v_x, v_y, v_z)$ for the velocity components
$v_x, v_y, v_z$ of the molecule (which is defined such that the probability
that the velocity components lie from $v_x$ to $v_x + dv_x$, $v_y$ to
$v_y + dv_y$, $v_z$ to $v_z + dv_z$ per unit volume of the gas is
$P(v_x, v_y, v_z)\,dv_x\,dv_y\,dv_z$):

$$P(v_x, v_y, v_z) = \frac{1}{V}\left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left[-\frac{m(v_x^2 + v_y^2 + v_z^2)}{2 k_B T}\right] \tag{2-75}$$

(check this out), which is nothing but Maxwell’s velocity distribution
formula.
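Maxwell's formula says that each velocity component is an independent Gaussian variable with variance $k_B T/m$. The following sketch, with hypothetical function names and arbitrary illustrative values of the mass and of $k_B T$, draws velocities accordingly and estimates the mean kinetic energy, which should approach $\frac{3}{2} k_B T$ for a large sample.

```python
import math
import random


def sample_velocity(mass, kT, rng):
    """Draw (vx, vy, vz); each component is Gaussian with variance kT/mass."""
    sigma = math.sqrt(kT / mass)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))


def mean_kinetic_energy(mass, kT, n_samples=200000, seed=0):
    """Sample average of (1/2) m v^2 over Maxwell-distributed velocities."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        vx, vy, vz = sample_velocity(mass, kT, rng)
        total += 0.5 * mass * (vx * vx + vy * vy + vz * vz)
    return total / n_samples


if __name__ == "__main__":
    kT = 1.5
    print(mean_kinetic_energy(2.0, kT), 1.5 * kT)  # estimate vs. (3/2) kT
```

The agreement is statistical: the estimate fluctuates about $\frac{3}{2} k_B T$ with a spread that decreases as the inverse square root of the sample size.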


Analogous to what was observed in the quantum context (sec. 2.1.5.3) the 
classical canonical ensemble also satisfies the principles of minimum free 
energy and of the maximum value of the entropy subject to the constraint 
of a specified value of the mean energy. More generally, the equilibrium 
ensembles satisfy variational principles involving the thermodynamic po- 


tentials. 


Before explaining the classical grand-canonical ensemble (sec. 2.2.5), we
outline below an important application of the canonical ensemble
describing the equilibrium configuration of a classical system at any
given temperature $T$, namely, the principle of equipartition of energy.


2.2.3 Principle of equipartition of energy 


Consider a system described by a classical Hamiltonian of the form

$$H(Q,P) = H'(w) + \lambda u^2, \tag{2-76}$$

where it is important to get the notation straight. The system under
consideration is assumed to be one with $s$ degrees of freedom
($s = 1, 2, 3, \cdots$), with $Q$ denoting collectively all the $s$ position
co-ordinates and $P$ similarly denoting all the momentum co-ordinates.
Among these, $u$ stands for one particular phase space co-ordinate (either
a position or a momentum co-ordinate), while $w$ denotes all the other
$2s - 1$ phase space co-ordinates taken together. One can, for instance,
identify $Q, P$ as $q_1, q_2, \cdots, q_s, p_1, p_2, \cdots, p_s$, and $w$ as
$q_1, q_2, \cdots, q_s, p_1, p_2, \cdots, p_{s-1}$, in which case $u$ would
correspond to $p_s$ (other choices for $w, u$ are evidently possible). In
the above expression, $\lambda$ is assumed to be a real positive parameter.


In other words, the Hamiltonian $H(Q,P)$ is assumed to be made up of two
independent parts, of which one is a positive quadratic term involving the
phase space co-ordinate $u$ while the other can be an arbitrary function
of the remaining phase space co-ordinates. The total Hamiltonian, however,
is assumed to satisfy the stability and temperedness criteria (refer
to sec. 5.2) required for thermodynamic stability of the system
considered. Since, generally speaking, $Q, P$ can be generalized position
and momentum co-ordinates (such as angles and angular momenta), we specify
explicitly that the range of variation of $u$ is to be from $-\infty$ to
$\infty$.


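The fact that the second term on the right hand side of (2-76) is
independent of the first, i.e., $u$ is distinct from the phase space
co-ordinates making up $w$ (together, $u, w$ constitute the complete set of
phase space co-ordinates $Q, P$), implies that, if the system is in
equilibrium at temperature $T = (k_B\beta)^{-1}$, then the mean energy is of
the form

$$\langle H \rangle = \langle H' \rangle + \frac{\int_{-\infty}^{\infty} du\, \lambda u^2\, e^{-\beta\lambda u^2}}{\int_{-\infty}^{\infty} du\, e^{-\beta\lambda u^2}}. \tag{2-77a}$$

In this formula, $\langle H \rangle$ represents the mean energy of the
system in equilibrium at temperature $T$, and

$$\langle H' \rangle = \frac{\int dw\, H'(w)\, e^{-\beta H'(w)}}{\int dw\, e^{-\beta H'(w)}}. \tag{2-77b}$$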


In explaining the notation, we refer, for the sake of concreteness, to the
particular case mentioned above where $w$ stands collectively for the set
of phase space co-ordinates $q_1, q_2, \cdots, q_s, p_1, p_2, \cdots, p_{s-1}$
(and $u$ stands for $p_s$). With this choice, $dw$ represents the
$(2s - 1)$-dimensional volume element

$$dw = dq_1 \cdots dq_s\, dp_1 \cdots dp_{s-1},$$

and the integral on the right hand side of the above formula covers the
entire range of variation of $q_1, q_2, \cdots, q_s, p_1, p_2, \cdots, p_{s-1}$.
One can then formally interpret the first term on the right hand side of
(2-77a) as the mean ‘energy’ of a system with ‘Hamiltonian’ $H'(w)$, though
the latter is not actually a Hamiltonian since the position co-ordinates
in it are not all paired with conjugate momenta. In the notation adopted
here, the full phase space volume element $dQ\,dP$ is given by

$$dQ\,dP = dw\,du. \tag{2-77c}$$


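Check formula (2-77a) out. Note that

$$\langle H(Q,P) \rangle = \frac{\int dw\,du\, H'(w)\, e^{-\beta(H' + \lambda u^2)} + \int dw\,du\, \lambda u^2\, e^{-\beta(H' + \lambda u^2)}}{\int dw\,du\, e^{-\beta(H' + \lambda u^2)}}.$$

The right hand side decomposes into two terms, in the first of which
$\int du\, e^{-\beta\lambda u^2}$ cancels out from the numerator and the
denominator, while in the second, it is $\int dw\, e^{-\beta H'}$ that
cancels out.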


The Gaussian integrals in the numerator and denominator of the second
term on the right hand side of (2-77a) are explicitly known (their ratio
is $1/(2\beta)$), and one obtains

$$\langle H \rangle = \langle H' \rangle + \frac{1}{2} k_B T. \tag{2-78}$$


It is straightforward to see that one can generalize to the case where
$r$ ($r = 1, 2, \cdots, 2s$) phase space co-ordinates occur in the
expression for the Hamiltonian $H(Q,P)$ as additive quadratic terms of the
form $\lambda_i u_i^2$ ($i = 1, 2, \cdots, r$), with all the coefficients
$\lambda_i$ real and positive, and with the range of variation of $u_i$
($i = 1, 2, \cdots, r$) covering the interval $-\infty$ to $\infty$. In that
case, one obtains

$$\langle H \rangle = \langle H'(w) \rangle + \frac{r}{2} k_B T. \tag{2-79}$$

In this formula, $w$ stands collectively for all the phase space
co-ordinates other than $u_1, \cdots, u_r$ and $H'(w)$ denotes the part of
the Hamiltonian $H$ that depends on these other co-ordinates.


One interprets this result as saying that all the $r$ phase space
co-ordinates of the type mentioned above, each occurring in the total
Hamiltonian solely as a positive quadratic term (with range of variation
from $-\infty$ to $\infty$), contribute equally to the mean energy, each
such contribution being $\frac{1}{2} k_B T$, this being precisely what is
meant by the term ‘equipartition’.


The principle of equipartition is a very convenient and useful one within 
the context of classical statistical mechanics, and enjoys a surprisingly 


wide application in numerous situations of practical interest. 


The simplest instance of the principle in action is provided by the
classical 1D harmonic oscillator, for which the Hamiltonian (2-70) is made
up solely of two independent quadratic terms of the type mentioned above,
which implies that the mean energy of the oscillator should be $k_B T$, in
agreement with (2-73).


In this context, it is of interest to mention the case of a 1D anharmonic
oscillator, described by the Hamiltonian

$$H(q,p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 q^2 + a q^4 \quad (a > 0). \tag{2-80}$$

The partition function can be worked out in terms of Bessel functions
(refer to [85]). In the case of weak anharmonicity
($\frac{a k_B T}{m^2\omega^4} \ll 1$), the mean energy works out to

$$\langle H \rangle = k_B T \left(1 - \frac{3 a k_B T}{m^2\omega^4}\right). \tag{2-81}$$


One obtains an extension of sorts of the principle of equipartition for the
quartic oscillator with the Hamiltonian

$$H(q,p) = \frac{p^2}{2m} + a q^4. \tag{2-82a}$$

The partition function is obtained by making use of the result
$\int_{-\infty}^{\infty} e^{-\gamma q^4}\, dq = \frac{\Gamma(1/4)}{2\gamma^{1/4}}$
($\gamma > 0$), where $\Gamma(u) = \int_0^{\infty} e^{-t} t^{u-1}\, dt$, the
gamma function with argument $u$, is well defined for $u > 0$. The mean
energy works out to

$$\langle H \rangle = \frac{3}{4} k_B T, \tag{2-82b}$$

where a contribution of $\frac{1}{2} k_B T$ comes from the quadratic term in
(2-82a) (principle of equipartition) while another contribution of
$\frac{1}{4} k_B T$ is obtained from the quartic term. This leads to the
result that each independent quartic term of the form $\lambda u^4$
($\lambda > 0$) in the Hamiltonian gives a contribution of
$\frac{1}{4} k_B T$ to the mean energy, where $u$ denotes a phase space
co-ordinate with a range of variation from $-\infty$ to $\infty$.
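The contributions quoted above for quadratic and quartic terms can be verified by direct quadrature. The sketch below is a rough numerical check with a hypothetical helper name; it computes the canonical average of a single term $\lambda u^n$ (with $n$ even) under the weight $e^{-\beta\lambda u^n}$, which should equal $k_B T/n$, that is, $\frac{1}{2} k_B T$ for $n = 2$ and $\frac{1}{4} k_B T$ for $n = 4$.

```python
import math


def mean_term_energy(power, lam=1.0, kT=1.0, n_points=20001):
    """Average of lam*u**power with weight exp(-lam*u**power/kT).

    u ranges over (-inf, inf); the range is cut off where the Boltzmann
    weight has dropped to about exp(-40). `power` must be an even integer.
    """
    beta = 1.0 / kT
    u_max = (40.0 / (beta * lam)) ** (1.0 / power)
    step = 2.0 * u_max / n_points
    numerator = 0.0
    denominator = 0.0
    for k in range(n_points):
        u = -u_max + (k + 0.5) * step   # midpoint rule
        energy = lam * u ** power
        weight = math.exp(-beta * energy)
        numerator += energy * weight
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    for n in (2, 4):
        print(n, mean_term_energy(n), 1.0 / n)  # estimate vs. kT/n with kT = 1
```

The result is independent of the coefficient $\lambda$, as in the quadratic case; only the power of $u$ matters.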


2.2.4 The virial theorem 


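In classical mechanics, the expression $G = \sum_i \mathbf p_i \cdot \mathbf r_i$
for a system of particles is referred to as the virial. Considering a
trajectory of the system in the phase space, the relation

$$\frac{dG}{dt} = \sum_i \left(\dot{\mathbf r}_i \cdot \mathbf p_i + \dot{\mathbf p}_i \cdot \mathbf r_i\right) \tag{2-83a}$$

holds at every point on the trajectory. On taking the time average of both
sides (denoted by an overbar on the relevant expression), we have

$$\lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} dt\, \frac{dG}{dt} = \overline{\sum_i \left(\dot{\mathbf r}_i \cdot \mathbf p_i + \dot{\mathbf p}_i \cdot \mathbf r_i\right)}. \tag{2-83b}$$

The left hand side is zero for a periodic orbit in the phase space and also
for a bounded, but not necessarily periodic, orbit (reason this out). If
$\mathbf F_i$ be the instantaneous force on the $i$th particle of the system
($i = 1, 2, \cdots, N$), so that $\dot{\mathbf p}_i = \mathbf F_i$, then one has

$$\overline{\sum_i \dot{\mathbf r}_i \cdot \mathbf p_i} = -\overline{\sum_i \left(\mathbf F_i \cdot \mathbf r_i\right)}, \tag{2-83c}$$

i.e., since $\sum_i \dot{\mathbf r}_i \cdot \mathbf p_i$ is twice the kinetic
energy,

$$\overline{K} = -\frac{1}{2}\, \overline{\sum_i \left(\mathbf F_i \cdot \mathbf r_i\right)}, \tag{2-83d}$$

where $K$ stands for the kinetic energy of the system under consideration.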
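The time-average form of the virial theorem can be illustrated on a bounded one-dimensional orbit. The sketch below rests on assumptions not found in the text: a single particle in the potential $\frac{1}{2} k x^2 + a x^4$ (the parameter values are arbitrary), integrated with the velocity Verlet scheme. It compares the time average of the kinetic energy with that of $-\frac{1}{2} F x$ along the trajectory.

```python
def force(x, k=1.0, a=0.25):
    """Force for the assumed potential (1/2) k x^2 + a x^4."""
    return -k * x - 4.0 * a * x ** 3


def virial_time_averages(x0=1.0, p0=0.0, mass=1.0, dt=2e-3, n_steps=200000):
    """Return (time average of K, time average of -(1/2) F x).

    The trajectory is generated with the velocity Verlet scheme and the
    averages are taken over the sampled points.
    """
    x, p = x0, p0
    f = force(x)
    k_sum = 0.0
    fx_sum = 0.0
    for _ in range(n_steps):
        p += 0.5 * dt * f       # half kick
        x += dt * p / mass      # drift
        f = force(x)
        p += 0.5 * dt * f       # half kick
        k_sum += p * p / (2.0 * mass)
        fx_sum += f * x
    return k_sum / n_steps, -0.5 * fx_sum / n_steps


if __name__ == "__main__":
    k_bar, virial_bar = virial_time_averages()
    print(k_bar, virial_bar)
```

The two averages differ by $(G(\tau) - G(0))/(2\tau)$, which is bounded in magnitude by a constant divided by the total time $\tau$ for a bounded orbit; this is the reasoning behind the vanishing of the left hand side of (2-83b).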


While eq. (2-83d) refers to time averages, it implies an analogous relation 
between phase space averages as well with reference to the appropriate 


equilibrium ensemble, assuming that the ergodic hypothesis holds good. 


Move ahead to sec. 9.2 for necessary background. The ergodic hypothesis
states that, under the Hamiltonian dynamics of the system under
consideration, all the phase space cells are sampled with a uniform
probability in the long run. This, essentially, is what the microcanonical
ensemble is based upon.


This gives us, finally,

$$\langle K \rangle = -\frac{1}{2} \left\langle \sum_i \left(\mathbf F_i \cdot \mathbf r_i\right) \right\rangle. \tag{2-83e}$$

Either of eq. (2-83d) and eq. (2-83e) constitutes the statement of the
virial theorem, though the latter form is more commonly used in
statistical mechanics.


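The force $\mathbf F_i$ can be split into two parts:

$$\mathbf F_i = \mathbf F_i^{(\mathrm{int})} + \mathbf F_i^{(\mathrm{ext})} \quad (i = 1, 2, \cdots, N), \tag{2-84}$$

where $\mathbf F_i^{(\mathrm{int})}$, the internal force, can be expressed as

$$\mathbf F_i^{(\mathrm{int})} = -\sum_{j \neq i} \frac{\partial u(r_{ij})}{\partial \mathbf r_i} \tag{2-85}$$

under the assumption of a two-body central force acting between the
particles, with $u(r)$ denoting the interaction potential for a separation
$r$ (here $r_{ij} = |\mathbf r_i - \mathbf r_j|$).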


The external force $\mathbf F_i^{(\mathrm{ext})}$, on the other hand, is
exerted on the $i$th particle ($i = 1, 2, \cdots, N$) by the wall of the
containing vessel (we consider, for the sake of concreteness, a simple
fluid as our system of interest, though the results hold more generally,
such as for a many-component fluid). One then has

$$\left\langle \sum_i \left(\mathbf F_i^{(\mathrm{ext})} \cdot \mathbf r_i\right) \right\rangle = -p \oint \mathbf r \cdot \hat{\mathbf n}\, dS, \tag{2-86}$$

where $p$ stands for the pressure exerted against the wall of the vessel,
and the integral is over the closed surface area of the latter
($\hat{\mathbf n}$ stands for the unit outward normal). On making use of
Gauss’ divergence theorem (with $\nabla \cdot \mathbf r = 3$), one obtains

$$\left\langle \sum_i \left(\mathbf F_i^{(\mathrm{ext})} \cdot \mathbf r_i\right) \right\rangle = -3pV, \tag{2-87}$$

where $V$ denotes the volume of the system. One then obtains the virial
theorem in the following form ([107], chapter 8)

$$\langle K \rangle = \frac{3}{2} pV + \frac{1}{2} \left\langle \sum_{i<j} r_{ij}\, \frac{\partial u(r_{ij})}{\partial r_{ij}} \right\rangle \tag{2-88}$$
(check this out). In the case of an ideal gas, the second term on the
right hand side reduces to zero, while, by the equipartition principle,
$\langle K \rangle = \frac{3}{2} N k_B T$, whereby one obtains the ideal gas
equation of state $pV = N k_B T$. For a real gas at a low pressure, the said
second term gives the correction to the equation of state arising due to
the interaction among the gas molecules. A systematic procedure for
working out the equation of state of a dilute gas in the form of a series
expansion, referred to as the virial expansion, will be outlined in
sec. 4.1.


A more general form of the virial theorem can be stated as follows. Let
$z_i, z_j$ ($i, j = 1, 2, \cdots, 6N$) be specified components of the
$6N$-component phase space variable made up of the position and momentum
vectors of the particles of a system, described by a Hamiltonian $H(z)$,
where $z$ stands for the collection of all the said $6N$ components. One
then has ([140], chapter 3, [67], chapter 6),

$$\left\langle z_i\, \frac{\partial H}{\partial z_j} \right\rangle = k_B T\, \delta_{ij} \quad (i, j = 1, 2, \cdots, 6N), \tag{2-89}$$

where $T$ stands for the temperature of the system in equilibrium.


As a particular instance, taking $z_i, z_j$ both to stand for any of the
three components of the momentum (we denote it by, say, $p_l$
($l = 1, 2, 3$)) of any specified particle of the system, one obtains

$$\left\langle \frac{p_l^2}{2m} \right\rangle = \frac{1}{2} k_B T, \tag{2-90}$$

where a factor of $\frac{1}{2}$ has been included in both sides. One
recognizes this as the statement of the equipartition principle
(sec. 2.2.3) from which, summing up for all the three momentum components
of all the $N$ particles, one obtains, for the internal energy of an ideal
gas,

$$U = \frac{3}{2} N k_B T, \tag{2-91}$$

which reproduces eq. (3-24) obtained earlier.


Further, taking $z_i, z_j$ (in (2-89)) as some specified component (1, 2, 3)
of the position vector of any specified particle, using
$\partial H/\partial \mathbf r_i = -\dot{\mathbf p}_i$, and then summing up
for all the three components of all the $N$ particles, one obtains

$$\left\langle \sum_{i=1}^{N} \mathbf r_i \cdot \dot{\mathbf p}_i \right\rangle = -3 N k_B T, \tag{2-92}$$

which, with $\langle K \rangle = \frac{3}{2} N k_B T$, is a re-statement of
(2-83e) obtained earlier.
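The general relation (2-89) does not require the Hamiltonian to be quadratic. As a hedged illustration, the sketch below takes the position-dependent part of the anharmonic oscillator (2-80), with arbitrary parameter values and a hypothetical function name, and evaluates the canonical average of $q\, \partial H/\partial q$ by quadrature; according to (2-89) with $i = j$ the result should be $k_B T$.

```python
import math


def mean_q_dH_dq(kT=1.0, m=1.0, omega=1.0, a=0.5, q_max=8.0, n_points=40001):
    """Canonical average of q * dH/dq for the potential
    (1/2) m omega^2 q^2 + a q^4 (the q-dependent part of (2-80)).

    The momentum integral cancels between numerator and denominator,
    so only the q-integral is evaluated (midpoint rule on [-q_max, q_max]).
    """
    beta = 1.0 / kT
    step = 2.0 * q_max / n_points
    numerator = 0.0
    denominator = 0.0
    for k in range(n_points):
        q = -q_max + (k + 0.5) * step
        potential = 0.5 * m * omega ** 2 * q * q + a * q ** 4
        d_potential = m * omega ** 2 * q + 4.0 * a * q ** 3
        weight = math.exp(-beta * potential)
        numerator += q * d_potential * weight
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    for kT in (0.5, 1.0, 2.5):
        print(kT, mean_q_dH_dq(kT=kT))
```

The quadratic and quartic parts of the potential do not separately contribute fixed amounts here; it is only the combination $q\, \partial H/\partial q$ whose average equals $k_B T$.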


2.2.5 The grand canonical ensemble: classical 


Now that we have settled matters of notation, and are familiar with the 
approach of statistical mechanics in the use of stationary ensembles 
in defining the state variables for a system and obtaining the relations 
among those, I will quickly go through the relevant statements relating to 


the grand canonical ensemble of classical statistical mechanics. 


For a system confined within a given volume ($V$), two parameters $\beta$
and $\mu$ are specified so as to define the grand canonical ensemble, for
which the normalized distribution function $\rho(Q,P;N)$ is given by

$$\rho(Q,P;N) = \frac{1}{Z_g} \exp\left[-\beta\left(H(Q,P;N) - \mu N\right)\right], \tag{2-93}$$

where the number of particles $N$ making up the system is no longer fixed,
but is considered as a variable, so that the microstates of the system
under consideration are defined in an extended phase space in terms
of $Q$, $P$, and $N$. In this expression, $Z_g$ stands for the grand
canonical partition function given in (2-94c) below.


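With the parameters $\beta$ and $\mu$ fixed, the mean energy and mean
particle number are obtained from the formulas

$$U = \langle E \rangle = \frac{1}{Z_g} \sum_N \frac{1}{N!\,h^{3N}} \int E(Q,P;N)\, e^{-\beta\left(E(Q,P;N) - \mu N\right)}\, dQ\,dP \tag{2-94a}$$

and

$$\bar N = \langle N \rangle = \frac{1}{Z_g} \sum_N \frac{1}{N!\,h^{3N}} \int N\, e^{-\beta\left(E(Q,P;N) - \mu N\right)}\, dQ\,dP, \tag{2-94b}$$

where the (classical) grand canonical partition function is given by

$$Z_g = \sum_{N=0}^{\infty} \frac{1}{N!\,h^{3N}} \int e^{-\beta\left(E(Q,P;N) - \mu N\right)}\, dQ\,dP. \tag{2-94c}$$
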
N=0 
In the above formulae, the summation is assumed to be over all possible
values of the particle number $N$ (now considered as a variable, distinct
from the thermodynamic variable $\bar N$ characterizing an equilibrium
state), and $E(Q,P;N)$ (equivalently, $H(Q,P;N)$) stands for the value of
the Hamiltonian function for a system of $N$ particles
($N = 1, 2, 3, \cdots$), where the positions and momenta of the particles
are $(q_1, q_2, \cdots, q_s)$, $(p_1, p_2, \cdots, p_s)$ ($s = 3N$), in the
$6N$-dimensional phase space $\Gamma(N)$ for a system with $N$ particles
(which is now a variable number).


The grand canonical ensemble can thus be defined alternatively in terms
of $U$ and the mean particle number $\bar N$ (in addition to $V$) instead
of the parameters $\beta$, $\mu$, since the two sets of parameters are
related by (2-94a), (2-94b). Operationally, it corresponds to the system
under consideration (confined within a given volume $V$) interacting with
a thermal reservoir and with a particle reservoir characterized by a given
temperature ($T = (k_B\beta)^{-1}$) and a given chemical potential ($\mu$)
per particle. Even for a ‘small’ system away from the thermodynamic limit,
the grand canonical ensemble provides us with a thermodynamic model in a
manner analogous to the canonical ensemble (refer to sec. 2.1.6.2). Though,
for such a system, $\beta$ and $\mu$ are not possessed of thermodynamic
significance, these can still be interpreted in operational terms by
considering interactions (or ‘collisions’) of the system with other
systems characterized by the same values of $\beta$, $\mu$ (refer to
sec. 2.1.5.2).


Analogous to the relation between the free energy (a thermodynamic
potential) and the canonical partition function (and also to the relation
between the entropy and the microcanonical partition function) one has the
relation

$$\Omega = -\beta^{-1} \ln Z_g, \tag{2-95a}$$

where $\Omega$ is the grand potential related to $F$ by the Legendre
transformation

$$\Omega = F - \mu N. \tag{2-95b}$$


The relations between the state functions defined in the quantum me- 
chanical grand canonical ensemble and mentioned in sections 2.1.6, 2.1.6.2, 


all remain valid in the classical context as well. 


Looking at formulas (2-39b) and (2-94c), one finds that in both the
quantum and classical contexts, the grand canonical partition function can
be expressed as a sum over canonical partition functions for various fixed
values of the particle number $N$:

$$Z_g = \sum_{N=0}^{\infty} e^{\beta\mu N} Z_c(N) = \sum_{N=0}^{\infty} y^N Z_c(N). \tag{2-96}$$

In this formula, $Z_c(N)$ stands for the canonical partition function for a
system of $N$ particles (refer to formulas (2-15b), (2-66b) where the
number of particles $N$ was assumed to be a parameter with a specified
integer value), and $y = e^{\beta\mu}$ is referred to as the fugacity.
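The fugacity expansion (2-96) can be tried out on a simple case. The sketch below assumes, for illustration only, an ideal gas with $Z_c(N) = z^N/N!$, where $z$ is a single-particle partition function such as (2-74b); this form of $Z_c(N)$ is an assumption made here and is not derived in the text above. For this choice the sum can be done in closed form, $Z_g = e^{yz}$, with mean particle number $yz$, which the truncated numerical sum should reproduce. The function name and the truncation `n_max` are hypothetical.

```python
import math


def grand_sums(fugacity, z_single, n_max=200):
    """Return (Z_g, mean N) from the truncated sum over N of y^N Z_c(N),
    with the assumed ideal-gas form Z_c(N) = z_single**N / N!.
    """
    z_grand = 0.0
    n_weighted = 0.0
    term = 1.0                      # N = 0 term: y^0 * z^0 / 0!
    for n in range(n_max + 1):
        z_grand += term
        n_weighted += n * term
        term *= fugacity * z_single / (n + 1)   # next term y^(n+1) z^(n+1) / (n+1)!
    return z_grand, n_weighted / z_grand


if __name__ == "__main__":
    y, z = 0.5, 8.0
    z_grand, n_mean = grand_sums(y, z)
    print(z_grand, math.exp(y * z))   # compare with exp(y z)
    print(n_mean, y * z)              # compare with y z
```

The weights $y^N Z_c(N)/Z_g$ form a Poisson distribution in $N$ in this case, so the truncation at `n_max` is harmless as long as `n_max` is well above $yz$.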


Finally, the classical grand potential $\Omega$ defined as above satisfies a
minimality principle that can be stated in terms analogous to the
corresponding principle outlined in the quantum context in sec. 2.1.6.1
(see sec. 5.6.3 where the principle is further explained).


2.2.6 Isobaric ensembles 


The microcanonical, canonical, and grand canonical ensembles are of 
fundamental relevance in statistical mechanics, while the isobaric ensem- 
bles are of immense practical relevance from the point of view of interpret- 


ing experimental observations and of studies in molecular dynamics. 


In molecular dynamics, one makes a model of a real-life system, assuming
a sufficiently realistic interaction among its constituent particles, and
numerically simulates its non-equilibrium dynamics and equilibrium
configuration (in accordance with an appropriate equilibrium ensemble)
under assumed conditions. With the advent of unprecedented enhancement of
computational power and efficiency in recent decades, along with the
appearance of greatly improved computational algorithms, molecular
dynamics has acquired the status of a discipline of major significance in
statistical physics, especially in studies on condensed systems. It now
complements and fosters foundational and theoretical explorations in
statistical physics and, in turn, is enriched by such explorations. In this
book, however, we will not have occasion to refer to this vast field of
study, one of great relevance in statistical mechanics. For an
introduction, see [140].


As for equilibrium statistical mechanics, experimental data on thermo- 
dynamic properties of systems are routinely reported with reference to 
conditions of constant pressure (and, frequently, of constant tempera- 
ture too). Such data are conveniently interpreted in terms of isobaric 
ensembles, in which the pressure (p) appears as a parameter conjugate
to volume (analogous to temperature in relation to energy and chemical 
potential in relation to the number of molecules of a component). In ad- 
dition, molecular dynamics simulations in the isobaric ensembles lead to 
results that can be compared with data obtained theoretically and exper- 


imentally. 


As indicated earlier on several occasions in this book (see, in particu- 
lar, sections 1.2.5, 1.4), the behavior of macroscopic systems can be 
described in terms of alternative but equivalent sets of thermodynamic 
variables, related to one another by Legendre transformations.
Correspondingly, these can be explained by invoking one or other among
several possible equilibrium ensembles of statistical mechanics, where all
these ensembles become equivalent in the thermodynamic limit. Each
of these various ensembles corresponds to the thermodynamic description
in terms of one among the various possible sets of thermodynamic
variables, depending on the constraints assumed to be imposed on the
system where, in the end, the constraints become irrelevant because of
the equivalence emerging in the thermodynamic limit.


Continuing to refer, for the sake of concreteness, to a simple fluid as our 
prototype system, we have seen that the ensemble corresponding to spec- 
ified values of U,V, N (describing an isolated system) is the microcanonical 
one. If we now assume that, instead of being isolated, the system inter- 
acts with a ‘pressure reservoir’, say, by being surrounded by gas in a 
very large vessel maintained at a constant pressure — the external pres- 
sure for the system under consideration, the latter being separated from 
the gas in the reservoir by the walls of a container with a freely movable 
part, then the variables H,p, N are conveniently made use of in describ- 
ing the state of the fluid. This is because U and V are, strictly speaking, 
fluctuating quantities now because of the interaction with the pressure 
reservoir though such fluctuations are microscopic in nature and become 
insignificant in the thermodynamic limit. Instead, the pressure p (the ex- 
ternal pressure, that is, which equals the fluctuating ‘internal’ pressure 
on the average) and the enthalpy H, defined by the Legendre transforma- 
tion (from U(S,V, N) to H(S,p, N)) 


H = U + pV, (2-97)


along with N now constitute a set of parameters that adequately describes
the thermodynamic state of the system. The functional relation
determining the enthalpy in terms of S, p, N now constitutes the
fundamental thermodynamic relation, from which one obtains

$$T = \frac{\partial H}{\partial S}, \qquad \bar V = \frac{\partial H}{\partial p}, \qquad \mu = \frac{\partial H}{\partial N}, \tag{2-98}$$

where $\bar V$ stands for the mean volume (the volume itself is a
fluctuating quantity in the present context) and $\mu$ for the chemical
potential per molecule.


Recall that, strictly speaking, $N$ is not a thermodynamic variable while
$\nu = N/N_A$, the mole number, is one ($N_A$ being the Avogadro number).
Making use of $N$ formally as a thermodynamic variable, one obtains
correct results in the thermodynamic limit.


The phase space distribution corresponding to specified values of
$H, p, N$ is referred to as the isoenthalpic-isobaric ensemble. The
distribution function $\rho_{H,p,N}$ is a constant on the constant enthalpy
surface (or, more precisely, in a shell corresponding to a narrow range of
values of the enthalpy). However, in defining the shell, one has to make
use of the Hamiltonian for some specified value of $V$ which, strictly
speaking, is a fluctuating variable in the present context. Thus, on
defining the distribution over the shell for a specified value ($V$) of the
volume, one has to additionally consider a uniform distribution over all
possible values of $V$.


The partition function characterizing this distribution is given by [140]

$$Z_{H,p,N} = \mathcal{N} \int_0^\infty dV \int d\mathbf{p}_1 \cdots d\mathbf{p}_N \, d\mathbf{r}_1 \cdots d\mathbf{r}_N \; \delta\big(H(z) - H + pV\big), \tag{2-99a}$$

where $\mathcal{N}$ is a normalization constant (see below) and where, notably,
there is an integration over the fluctuating volume V. The phase space
integration is to be performed before the volume integration, i.e., for any
specified value of V, the integrals over $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$
are to be evaluated for a volume V, while H(z) (the Hamiltonian, as distinct
from the enthalpy H) is also to be taken as the Hamiltonian corresponding to
that volume (note that, additionally, the integrand contains V explicitly).
As regards the notation, z stands collectively for the position and momentum
vectors of all the particles, i.e., for all the phase space co-ordinates
taken together.


The normalization constant is given by

$$\mathcal{N} = \frac{E_0}{N!\, h^{3N}\, V_0}, \tag{2-99b}$$

where $E_0$ and $V_0$ are constants chosen to make the partition function
dimensionless, whose values, however, have no effect on the results derived
on the basis of the partition function.


With this expression for the partition function, one obtains the various
thermodynamic parameters by referring to the entropy S that can be obtained
as a function of H, p, N by inverting the fundamental relation specifying
H(S, p, N). With the entropy given by the Boltzmann formula

$$S = k_B \ln Z_{H,p,N}, \tag{2-100}$$


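one obtains

$$\frac{1}{T} = \frac{\partial S}{\partial H}, \qquad \frac{\bar{V}}{T} = -\frac{\partial S}{\partial p}, \qquad \frac{\mu}{T} = -\frac{\partial S}{\partial N}, \tag{2-101}$$

where $\bar{V}$ stands for the mean volume in the ensemble.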
Compared to the isoenthalpic-isobaric ensemble described above, the
isothermal-isobaric ensemble is often of more direct relevance. The
independent thermodynamic variables here are T, p, N, and the fundamental
relation expresses the Gibbs free energy G in terms of T, p, N. The latter
is obtained from S(U, V, N) by means of the Legendre transformation

$$G = U + pV - TS \quad (= H - TS). \tag{2-102}$$


The partition function of the isothermal-isobaric ensemble at inverse
temperature $\beta$ is given by ([140], [107])

$$Z_{\beta,p,N} = \frac{1}{V_0} \int_0^\infty dV \, e^{-\beta p V} Z_c(\beta, V, N), \tag{2-103}$$

where $V_0$ is once again an appropriately chosen constant (whose actual
value is not relevant in determining the thermodynamic parameters of the
system under consideration) to make the partition function dimensionless
and $Z_c(\beta, V, N)$ is the canonical partition function at inverse
temperature $\beta$.


The Gibbs free energy is obtained from $Z_{\beta,p,N}$ as

$$G(\beta, p, N) = -\beta^{-1} \ln Z_{\beta,p,N}, \tag{2-104}$$


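while the other relevant thermodynamic variables are given by

$$S = -\frac{\partial G}{\partial T}, \qquad \bar{V} = \frac{\partial G}{\partial p}, \qquad \mu = \frac{\partial G}{\partial N}. \tag{2-105}$$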
Finally, the heat capacity at constant pressure is obtained as [140]

$$C_p = k_B \beta^2 \frac{\partial^2}{\partial \beta^2} \ln Z_{\beta,p,N}. \tag{2-106}$$
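As a rough numerical illustration of (2-103) to (2-105), one may take the volume dependence of the ideal-gas canonical partition function, $Z_c \propto V^N$, and check that the mean volume obtained from $\partial G/\partial p$ comes out as $(N+1)/(\beta p)$, which tends to the ideal-gas value $N k_B T/p$ for large N. The sketch below is only that, a sketch: the function names `log_z_npt` and `mean_volume`, the choice of units with $k_B = 1$ and $V_0 = 1$, and the dropping of all V-independent factors in $Z_c$ are assumptions made for the illustration and are not taken from the text.

```python
import math

def log_z_npt(beta, p, n, v0=1.0, steps=20000):
    """Return ln Z_{beta,p,N} for a model canonical factor Z_c = V**n.

    All V-independent factors of the ideal-gas Z_c are dropped (assumption);
    they shift ln Z by a constant and do not affect derivatives in p.
    """
    # The integrand V**n * exp(-beta*p*V) peaks near V = n/(beta*p);
    # integrate far past the peak with the composite Simpson rule.
    v_max = 60.0 * (n + 1) / (beta * p)
    h = v_max / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        v = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * v**n * math.exp(-beta * p * v)
    return math.log(total * h / 3.0 / v0)

def mean_volume(beta, p, n, rel_step=1e-4):
    """Mean volume from V = dG/dp with G = -ln Z / beta (central difference)."""
    dp = rel_step * p
    g_plus = -log_z_npt(beta, p + dp, n) / beta
    g_minus = -log_z_npt(beta, p - dp, n) / beta
    return (g_plus - g_minus) / (2.0 * dp)

beta, p, n = 1.0, 2.0, 5          # units with k_B = 1 (assumption)
print(mean_volume(beta, p, n))     # close to (n + 1) / (beta * p)
```

The extra 1 in N + 1 comes from the volume integration itself and is immaterial in the thermodynamic limit.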


2.3 Equilibrium ensembles and thermodynamics


2.3.1 Equilibrium ensembles: a brief overview 


The equilibrium ensembles refer to probability distributions over the mi- 
croscopic states of a system, and provide the link between microscopic 


and macroscopic descriptions of its equilibrium configurations. 


An equilibrium state is determined by a number of macroscopic constraints
on the system under consideration. It is described in microscopic terms by
means of a density matrix, say, $\hat\rho$ (a probability distribution over
the phase space in the case of a classical system; we will, in the present
section, refer to the quantum mechanical description, analogous statements
holding for the classical theory).


Generally speaking, the constraints relate to one or more conserved
quantities characterizing the system and can be grouped into two categories:
a constraint can specify the value of a conserved quantity itself, such as
the energy (U), the volume (V), or the particle number (N), or it can specify
its mean value.


The symbol U denotes the internal energy in the thermodynamic description.


In this book a simple fluid is used as the default system for illustrating 
the basic principles of statistical mechanics. More complex systems met 
with in real life can be brought under the purview of the theory without 
the necessity of invoking new principles of fundamental relevance, though 
ingenious theoretical innovations prove to be essential in the interpretation 


of their thermodynamic behavior. 


As we have seen, the former type of constraint results in a uniform
probability distribution over the pure states compatible with the constraint
data. The latter, on the other hand, leads to a Gibbs type probability
distribution. The unifying principle underlying the Gibbs type distributions
is a variational one that we summarize as follows (refer to [2], chapter 4,
for a cogent exposition).


It is straightforward to check that the extremum principles (maximum en- 
tropy, minimum free energy, minimum grand potential) stated in respect of 
the ensembles introduced in sec. 2.1 are all subsumed under this common 


unifying principle. 
Consider a set of observables $\hat{A}_i$ whose mean values $\langle A_i \rangle$
($i = 1, \ldots, K$, say) are specified as constraints. If $\hat\rho$ be the
equilibrium ensemble compatible with these constraints, then the constrained
entropy (up to a multiplicative factor)

$$\tilde{S} = -\mathrm{Tr}(\hat\rho \ln \hat\rho) - \sum_{i=1}^{K} \lambda_i \langle A_i \rangle - \lambda_0 \mathrm{Tr}\hat\rho, \tag{2-107a}$$

is to be a maximum among all possible choices for $\hat\rho$. Here
$\lambda_0, \lambda_1, \ldots, \lambda_K$ are a set of Lagrange multipliers that
are to be determined from the condition of stationarity of the constrained
entropy ($\delta\tilde{S} = 0$ for arbitrary variations $\delta\hat\rho$), along
with the constraint conditions

$$\begin{aligned} \mathrm{Tr}(\hat\rho \hat{A}_i) &= \langle A_i \rangle \quad (i = 1, \ldots, K), \\ \mathrm{Tr}\hat\rho &= 1. \end{aligned} \tag{2-107b}$$


Among these equations of constraint, the one in the second line of (2-107b)
corresponds to the normalization of the density matrix we are looking for.


The solution to this problem is

$$\hat\rho = \frac{1}{Z} e^{-\sum_{i=1}^{K} \lambda_i \hat{A}_i}, \tag{2-108a}$$

$$Z = e^{1+\lambda_0}, \tag{2-108b}$$

which, however, is a formal one in that the constants $\lambda_i$
($i = 1, \ldots, K$) and Z (or, equivalently, $\lambda_0$) remain undetermined.
In order to complete the solution, one needs to append the following results
to (2-108a), (2-108b).
First, Z is determined in terms of the $\lambda_i$ ($i = 1, \ldots, K$) as

$$Z \, (= e^{1+\lambda_0}) = \mathrm{Tr}\big(e^{-\sum_{i=1}^{K} \lambda_i \hat{A}_i}\big). \tag{2-109a}$$


The multipliers $\lambda_i$ are then determined from the equations (obtained
from the condition of stationarity of the constrained entropy, with Z given
by (2-109a))

$$-\frac{\partial \ln Z}{\partial \lambda_i} = \langle A_i \rangle \quad (i = 1, \ldots, K) \tag{2-109b}$$


(note that these form a set of implicit equations). For instance, in the
case of the grand canonical ensemble for a simple fluid, there are two
conditions of constraint involving the mean values of H and N, and two
multipliers $\beta$, $-\beta\mu$ (additionally, there is the constraint
corresponding to a specified volume V).


It can also be checked that, subject to these values of the undetermined
multipliers and the corresponding equilibrium distribution $\hat\rho$, the
constrained entropy $\tilde{S}$ is actually a maximum when compared with
other, neighboring, distributions over pure states that deviate from the
stationarity condition.


The equilibrium entropy (to be identified with the thermodynamic entropy
in the thermodynamic limit; see chapter 5, and refer to sec. 2.3.2 below)
is then given by

$$S = k_B \Big[ \ln Z - \sum_{i=1}^{K} \lambda_i \frac{\partial \ln Z}{\partial \lambda_i} \Big]. \tag{2-110}$$
With the equilibrium ensemble given by (2-108a), the above expression
for the equilibrium entropy reduces to the familiar expression

$$S = -k_B \mathrm{Tr}(\hat\rho \ln \hat\rho). \tag{2-111}$$
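The variational scheme summarized above can be tried out on a small discrete example. The following is a minimal sketch, assuming a single observable that is diagonal with four hypothetical eigenvalues and a prescribed mean value; the names `gibbs`, `solve_multiplier`, and so on are illustrative only. It solves the implicit equation (2-109b) for the single multiplier by bisection, compares the two entropy expressions (2-110) and (2-111) in units of $k_B$, and evaluates the entropy of a neighboring distribution that satisfies the same constraints.

```python
import math

def gibbs(values, lam):
    """Gibbs probabilities p_n = exp(-lam * a_n) / Z and Z itself."""
    weights = [math.exp(-lam * a) for a in values]
    z = sum(weights)
    return [w / z for w in weights], z

def mean(values, probs):
    return sum(a * p for a, p in zip(values, probs))

def entropy(probs):
    """-sum p ln p, i.e. entropy in units of k_B."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def solve_multiplier(values, target, lo=-50.0, hi=50.0):
    """Bisection for lam such that the Gibbs mean equals target."""
    # The Gibbs mean decreases monotonically with lam (slope = -variance).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        probs, _ = gibbs(values, mid)
        if mean(values, probs) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

values = [0.0, 1.0, 2.0, 3.0]   # eigenvalues of a hypothetical observable A
target = 1.0                    # prescribed mean value <A>
lam = solve_multiplier(values, target)
probs, z = gibbs(values, lam)

s_legendre = math.log(z) + lam * target   # (2-110) with <A> = -d ln Z/d lam
s_gibbs = entropy(probs)                  # (2-111)

# A perturbation that keeps both normalization and <A> fixed
direction = [1.0, -2.0, 1.0, 0.0]
perturbed = [p + 1e-3 * d for p, d in zip(probs, direction)]
print(s_legendre, s_gibbs, entropy(perturbed))
```

The first two printed numbers should agree, while the third should be slightly smaller, in line with the maximum property of the Gibbs distribution.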


Finally, the condition for the maximum value of the constrained entropy
implies a relation of symmetry between the sets of variables
$\{S, \langle A_i \rangle\}$ and $\{\ln Z, \lambda_i\}$ ($i = 1, \ldots, K$)
conforming to a Legendre transformation where, corresponding to (2-109b),
one obtains

$$\frac{1}{k_B} \frac{\partial S}{\partial \langle A_i \rangle} = \lambda_i \quad (i = 1, \ldots, K). \tag{2-112}$$


Referring, for instance, to the canonical ensemble where there is just one
multiplier $\lambda_1 = \beta$ (with $\langle A_1 \rangle = \mathrm{Tr}(\hat\rho \hat{H}) = U$,
the mean energy), the results summarized above appear in the form

$$Z = \sum_i e^{-\beta \epsilon_i}, \tag{2-113a}$$

$$\frac{\partial \ln Z}{\partial \beta} = -U, \tag{2-113b}$$

$$\frac{1}{k_B} \frac{\partial S}{\partial U} = \beta, \tag{2-113c}$$

where $\epsilon_i$ ($i = 1, 2, \ldots$) denote the energy eigenvalues (arranged,
say, in increasing order). All these formulae are familiar by now.


The fact that the constrained entropy is a maximum for the equilibrium
ensemble obtained above corresponds, in the thermodynamic limit, to
that of thermodynamic stability whereby a number of static response
functions are found to have definite signs. Related to this is the fact that
the entropy is a concave function of the parameters $A_1, \ldots, A_K$
(U, V, N in the case of a simple fluid). Referring to the Legendre
transformation mentioned above, it can be checked that $\ln Z$ is, likewise,
a convex function of the parameters $\lambda_i$ ($i = 1, \ldots, K$).
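The convexity of $\ln Z$ can be checked numerically in the simplest setting of a single multiplier, where the second derivative of $\ln Z$ with respect to the multiplier equals the variance of the observable in the Gibbs distribution and is therefore non-negative. The snippet below is a sketch under stated assumptions: the three-valued spectrum, the value of the multiplier, and the helper names `log_z` and `variance` are all hypothetical.

```python
import math

def log_z(values, lam):
    """ln Z for a single observable with the given eigenvalues."""
    return math.log(sum(math.exp(-lam * a) for a in values))

def variance(values, lam):
    """Variance of the observable in the Gibbs distribution at lam."""
    weights = [math.exp(-lam * a) for a in values]
    z = sum(weights)
    m1 = sum(a * w for a, w in zip(values, weights)) / z
    m2 = sum(a * a * w for a, w in zip(values, weights)) / z
    return m2 - m1 * m1

values = [0.0, 1.0, 2.5]   # hypothetical spectrum of a single observable
lam, h = 0.7, 1e-3

# Central second difference approximating d^2 ln Z / d lam^2
second_diff = (log_z(values, lam + h) - 2 * log_z(values, lam)
               + log_z(values, lam - h)) / h**2
print(second_diff, variance(values, lam))
```

The two printed numbers should agree to within the accuracy of the finite difference, and both are positive.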


In the above paragraphs we have referred to the equilibrium ensembles 
without reference to the microscopic dynamics determined by the Hamil- 
tonian of the system. Referring to the time evolution of a mixed state (equa- 
tions (8-35d) and (8-103), in the classical and the quantum contexts respec- 
tively), the question arises as to whether and in what sense the probability 
distribution describing the mixed state evolves to an equilibrium distribu- 
tion, say, to the one corresponding to the microcanonical ensemble in the 
case of an isolated system. Boltzmann took an early step in answering 
this question (in the classical context) by way of formulating the ergodic 
hypothesis (section 9.2.2; refer back to sec. 2.2.4) that explained why the 
equilibrium ensemble had to be one corresponding to a uniform probability 
distribution in the phase space, but left unanswered the question as to why 
and how the equilibrium ensemble is approached at large times. Another 
effective step in the right direction was taken by Gibbs by way of introduc- 


ing the notion of mixing (sec. 9.4; the ideal gas enclosed in a rectangular 
volume with smooth walls constitutes an instance of an ergodic system, but
one that is not of the mixing type). The question of justifying the equilib- 
rium ensembles will be taken up in chapters 9, 10. From another point 
of view, the equilibrium ensembles are justified by a successful interpreta- 
tion of the principles of equilibrium and near-equilibrium thermodynamics 


(sections 2.3.2, 8.5) from these. 


2.3.2 The microscopic interpretation of thermodynamic 
principles 


Equilibrium thermodynamics (or, simply, thermodynamics) begins by rec- 
ognizing two distinct modes of energy transfer to and from a macroscopic 
system, namely work and heat. The former is effected by bringing about 
changes in the volume, mole number, and similar other parameters like 
the strengths of applied fields, if any, that define the equilibrium
configuration of the system under consideration. We denote the parameters of
this kind by symbols $A_1, A_2, \ldots$ or, collectively, by A.


When the internal energy U is included in the set {A} of configurational 
parameters (in which case it is denoted by the symbol $A_0$ for the sake of
uniformity of notation which, however, differs slightly from that in sec. 2.3.1 
above), one obtains a set of parameters in terms of which the fundamental 
thermodynamic relation S = S(A) can be set up. In the case of a simple 
fluid, the parameters U,V, N make up such a basic set. This is the default 
system to be considered in most instances in this book. In the following, the 
internal energy U will not be included in the set of parameters {A}, unless 


stated explicitly. 
When either work or heat is given to a system, it leads to an increase in
its internal energy, the microscopic expression for the latter being

$$U = \mathrm{Tr}(\hat\rho \hat{H}). \tag{2-114}$$


If the Hamiltonian $\hat{H}$ remains unchanged owing to the parameters {A}
being held constant, while the distribution $\hat\rho$ over the energy
eigenstates is made to undergo a small change $\delta\hat\rho$, the change in
U is identified as the heat transferred to the system under consideration:

$$\text{đ}Q = \mathrm{Tr}\big((\delta\hat\rho)\,\hat{H}\big), \tag{2-115a}$$


where, in mathematical terms, the symbol đQ stands for an inexact
differential. In a similar manner, considering an infinitesimal change
$\delta\hat{H}$ in the Hamiltonian (brought about by a small change in the
parameters {A}) whose only effect on the energy levels of the system is a
small shift in each of these (there being a one-to-one correspondence in the
energy levels before and after the change), and assuming that the
probability distribution $\hat\rho$ over the energy eigenstates remains
unchanged, we identify the work done on the system as

$$\text{đ}W = \mathrm{Tr}\big(\hat\rho\,(\delta\hat{H})\big). \tag{2-115b}$$


One thereby obtains the first law of thermodynamics in the special case
of an infinitesimal change that can be looked upon as the resultant of two
independent infinitesimal changes, as

$$\delta U = \text{đ}Q + \text{đ}W. \tag{2-116}$$
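The split of a small energy change into heat and work, (2-115a), (2-115b) and (2-116), can be made concrete with a few lines of arithmetic. The sketch below assumes a density matrix that is diagonal in the energy eigenbasis, so that the traces reduce to sums over levels; the three-level spectrum scaling as $n^2/a^2$ with a single parameter a, the canonical probabilities, and all function names are hypothetical choices made only for the illustration (units with $k_B = 1$).

```python
import math

def levels(a):
    """Hypothetical spectrum depending on one parameter a (box-like scaling)."""
    return [n * n / a**2 for n in (1, 2, 3)]

def canonical(eps, beta):
    """Canonical probabilities over the given energy levels."""
    weights = [math.exp(-beta * e) for e in eps]
    z = sum(weights)
    return [w / z for w in weights]

def energy(probs, eps):
    return sum(p * e for p, e in zip(probs, eps))

beta, a = 1.0, 1.0
d_beta, d_a = 1e-4, 1e-4

eps0, eps1 = levels(a), levels(a + d_a)
p0, p1 = canonical(eps0, beta), canonical(eps1, beta + d_beta)

d_p = [q - p for p, q in zip(p0, p1)]
d_eps = [f - e for e, f in zip(eps0, eps1)]

heat = sum(dp * e for dp, e in zip(d_p, eps0))     # Tr((delta rho) H)
work = sum(p * de for p, de in zip(p0, d_eps))     # Tr(rho (delta H))
d_u = energy(p1, eps1) - energy(p0, eps0)
residual = d_u - (heat + work)                     # second-order remainder
print(heat, work, d_u, residual)
```

The remainder equals the sum of the products of the small changes in the probabilities and in the levels, and so is of second order in the small changes, which is why the first law holds for the infinitesimals.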


A sequence of such infinitesimal changes that proceeds through a suc- 
cession of equilibrium states makes up a quasi-static process. One can 
work out the work done on the system under consideration and the heat 
transferred to it by integrating over the infinitesimals đW, đQ, in which
case one finds that the first law holds for such processes as well. In the 
case of an irreversible process, however, the work done (W) or heat trans- 
ferred (Q) cannot be expressed in terms of thermodynamic parameters 
of the system in question, but have to be worked out with reference to 
parameters of external systems (large systems acting as work reservoirs 
and heat reservoirs) in which case the first law is again satisfied in the 


form 


$$\Delta U = Q + W. \tag{2-117}$$


Thermodynamic parameters such as temperature (T), pressure (p), and
chemical potential ($\mu$) are defined by referring to families of ensembles
(we consider, in particular, the grand canonical ensemble for a simple
fluid) as in sec. 2.1.6.2, and then invoking the equivalence between the
microcanonical and the grand canonical ensembles in the thermodynamic
limit. In this limit, the entropy appears as an extensive parameter, while
being a concave function of U, V, N. Referring, then, to the maximum
property of the entropy, one can check that the parameters T, p, $\mu$ have
the appropriate characteristics expected of these on physical grounds, as
indicated in sec. 2.1.4. One can alternatively (and equivalently) identify
these same parameters in terms of the Lagrange multipliers ($\lambda_i$
($i = 1, 2, \ldots$), refer back to sec. 2.3.1) and adopt a similar approach
to establish their physical significance.


As a simple instance of how the equilibrium ensembles of statistical me- 
chanics lead to thermodynamic properties of a system, the fundamental 
relation of the classical ideal gas, relating its entropy with the internal en- 
ergy, volume, and mole number (or, equivalently, the number of particles N) 
will be derived in sec. 3.2, on the basis of these equilibrium ensembles. As 
can be verified in a straightforward manner, all the thermodynamic proper- 
ties of the ideal gas, which closely approximate those of a dilute real gas, 


follow from this fundamental relation. 


The microscopic interpretation of the principle of increase of entropy
embodied in the second law of thermodynamics is obtained on the basis
of the maximum principle of the constrained entropy, as outlined in
sec. 2.3.1. While this principle is invoked to work out the ensembles
describing equilibrium states, it can be made use of in explaining the
increase of entropy in an irreversible process from an initial state ‘i’ to
a final state ‘f’ of an isolated system by interpreting ‘i’ as an
equilibrium state under a set of constraints in addition to the ones
defining ‘f’. The principle of increase of entropy in the spontaneous
process taking the system from ‘i’ to ‘f’ on release of the additional
constraints then follows.


For a more complete account of the interpretation of the laws of
thermodynamics in microscopic terms based on the equilibrium ensembles
of statistical mechanics, see [108], [2], [46]. The explication of
thermodynamic principles from the perspective of the eigenstate
thermalization hypothesis will be outlined in chapter 10, sec. 10.3.5,
where the content of the present section will be briefly recalled.
Chapter 3


The Ideal Gas in the Classical 


and Quantum Theories 


3.1 Sum of independent Hamiltonians 


We begin this chapter by considering a system described by a Hamiltonian
of the form

$$\hat{H} = \hat{H}^{(1)} + \hat{H}^{(2)}, \tag{3-1}$$

where the $\hat{H}^{(i)}$ ($i = 1, 2$) are independent of each other and commute
among themselves. In this case the eigenvectors of $\hat{H}$ can be expressed
as direct products of eigenvectors of the ‘component’ Hamiltonians
$\hat{H}^{(i)}$ ($i = 1, 2$), and the corresponding eigenvalues turn out to be
the sums of eigenvalues corresponding to the factors in the direct products.
Thus, if $|E_a^{(1)}\rangle$ and $|E_b^{(2)}\rangle$ denote eigenvectors of
$\hat{H}^{(1)}$, $\hat{H}^{(2)}$, with eigenvalues $E_a^{(1)}$, $E_b^{(2)}$, then
a typical eigenvector of $\hat{H}$ is of the form of a direct product
$|E_a^{(1)}\rangle \otimes |E_b^{(2)}\rangle$, with eigenvalue
$E_a^{(1)} + E_b^{(2)}$, where the indices a, b label the eigenvectors
and the corresponding eigenvalues of the two Hamiltonians.


In this case the canonical partition function at temperature T of the system
described by (3-1) is given by (we drop the suffix ‘c’ on canonical partition
functions for the sake of brevity)

$$Z = \sum_{a,b} \exp\big[-\beta\big(E_a^{(1)} + E_b^{(2)}\big)\big] = \sum_a e^{-\beta E_a^{(1)}} \sum_b e^{-\beta E_b^{(2)}}, \tag{3-2a}$$

i.e.,

$$Z = Z^{(1)} Z^{(2)}. \tag{3-2b}$$

In other words, the partition function decomposes into the product of the
canonical partition functions of the ‘component’ systems described by the
Hamiltonians $\hat{H}^{(1)}$, $\hat{H}^{(2)}$.
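The factorization in (3-2a), (3-2b) is easily checked by direct summation. In the sketch below the two short lists of eigenvalues and the value of $\beta$ are hypothetical, and `partition` is just an illustrative helper name.

```python
import math

def partition(levels, beta):
    """Canonical partition function for a list of energy eigenvalues."""
    return sum(math.exp(-beta * e) for e in levels)

e1 = [0.0, 0.4, 1.1]        # hypothetical eigenvalues of H^(1)
e2 = [0.2, 0.9]             # hypothetical eigenvalues of H^(2)
beta = 1.3

# Left side of (3-2a): sum over all product states |a> (x) |b>
z_composite = sum(math.exp(-beta * (ea + eb)) for ea in e1 for eb in e2)
# Right side: product of the two component partition functions
z_product = partition(e1, beta) * partition(e2, beta)
print(z_composite, z_product)
```

The two numbers agree up to rounding, since the double sum over (a, b) is nothing but the product of the two single sums.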

For the sake of easy reference we say that we now have a ‘composite’ 
system made up of two independent ‘components’, even though the so- 
called components may not constitute physically identifiable distinct con- 
stituents of the system. All we need is that the composite Hamiltonian 
be a sum over the component Hamiltonians, which should be mutually 
independent and should commute among themselves. An equivalent way 
of expressing this requirement is to say that the components be distinct
from one another.


The above requirement on the component Hamiltonians is ensured if these
are assumed to operate on distinct vector spaces, say, $V^{(1)}$, $V^{(2)}$.
From a technical point of view, formula (3-1) is then to be replaced with
the more explicit form

$$\hat{H} = \hat{H}^{(1)} \otimes \hat{I}^{(2)} + \hat{I}^{(1)} \otimes \hat{H}^{(2)}, \tag{3-3}$$

which is an indication of the fact that $\hat{H}$ operates in the product
vector space $V^{(1)} \otimes V^{(2)}$. Here $\hat{I}^{(1)}$, $\hat{I}^{(2)}$ are
unit operators in the spaces $V^{(1)}$, $V^{(2)}$, while
$\hat{H}^{(1)} \otimes \hat{I}^{(2)}$, $\hat{I}^{(1)} \otimes \hat{H}^{(2)}$ are
operators in the product space. We will, for the sake of brevity, continue
to refer to the simpler form (3-1), while keeping in mind that (3-3)
represents the more correct and explicit form of the expression for $\hat{H}$.


State vectors in $V^{(1)} \otimes V^{(2)}$ are linear combinations of product
vectors of the form $|a^{(1)} b^{(2)}\rangle = |a^{(1)}\rangle \otimes |b^{(2)}\rangle$
with the first and the second factors belonging to $V^{(1)}$ and $V^{(2)}$
respectively. Assuming that the eigenstates $|E_a^{(1)}\rangle$
($a = 0, 1, 2, \ldots$), $|E_b^{(2)}\rangle$ ($b = 0, 1, 2, \ldots$) form complete
orthonormal sets in $V^{(1)}$, $V^{(2)}$, the product vectors
$\{|E_a^{(1)} E_b^{(2)}\rangle\}$ form a complete orthonormal set in
$V^{(1)} \otimes V^{(2)}$ (this, indeed, constitutes the definition of
$V^{(1)} \otimes V^{(2)}$).


As mentioned above, the Hamiltonians $\hat{H}^{(1)}$, $\hat{H}^{(2)}$ are said to
refer to independent components, or ‘subsystems’, of the system described by
$\hat{H}$. One can extend the definition to include more than two subsystems
making up
a composite system. For a composite system made up of any number of 
independent subsystems described by commuting Hamiltonians (such sys- 
tems are, at times, referred to as ‘separable’ [111]), the energy eigenstates 
are products of energy eigenstates of the individual subsystems, and the 
corresponding energy eigenvalues are sums of the eigenvalues for the re- 
spective subsystems, provided that the subsystems are distinguishable, in 
which case one can refer to these subsystems by means of distinguishing 
labels (such as the labels ‘(1)’ and ‘(2)’ in (3-3)). The case of indistinguishable 


components will be considered separately in subsequent sections. 
The subsystems need not correspond to physically distinct objects but may
refer to distinct modes associated with a given physical system; for
instance, these may be the electronic, vibrational, and rotational modes of a
single molecule, or the various normal modes of oscillation of a field
enclosed within a given volume, or of a crystalline lattice; these may even
correspond to various excitations of the normal modes (photons in an
electromagnetic field, or phonons in a crystalline solid). Results relating
to specific instances will be briefly outlined below.


While the above considerations relate to the quantum context, the
formula (3-2b) itself remains valid in the classical context as well. Once
again, all that is needed here is that the Hamiltonian of the ‘composite’
system under consideration be of the form of a sum over ‘component’
Hamiltonians that depend on independent sets of phase space co-ordinates.


The above result (eq. (3-2b)) can be immediately generalized to a composite
system made up of K number of independent components ($K = 1, 2, \ldots$).
Denoting the canonical partition functions of the components by
$Z^{(1)}, \ldots, Z^{(K)}$, the partition function of the composite system is
given by the product of the component partition functions,

$$Z = Z^{(1)} Z^{(2)} \cdots Z^{(K)} = \prod_{k=1}^{K} Z^{(k)}. \tag{3-4a}$$

A special instance of (3-4a), one of quite considerable relevance, relates
to a system for which the component Hamiltonians all have identical sets of
eigenvalues, or pertain to phase spaces of identical structures in the
classical context. The classical ideal gas made up of N number of identical
particles constitutes an important example. In this case, all the partition
functions $Z^{(1)}, \ldots, Z^{(K)}$ are the same. Denoting this common
value by z, we obtain

$$Z = z^K. \tag{3-4b}$$


This formula implies that the free energy and the mean energy (the internal
energy in the thermodynamic limit) of a system made up of K number of
independent components, all having the same set of energy eigenvalues (or
the same partition function in the classical description), are given by K
times the corresponding values for a single component. More generally,
the free energy or the mean energy of a system made up of a number of
independent components is just the sum of the corresponding values for
the constituent components (reason this out). This principle of additivity
of free energy and mean energy applies to the entropy too (reason this out).
If, however, the said components are all identical as in the case of an ideal
gas, then one needs a modification of (3-4b), as we see below, where one
needs to take into account the fact that the microstates of the system are to
be invariant under a permutation of its constituents.
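The additivity statements follow from $F = -\beta^{-1} \ln Z$ and $U = -\partial \ln Z/\partial\beta$, since the logarithm turns the product (3-4a) into a sum, while $S = (U - F)/T$ then inherits the additivity. A brute-force check is sketched below for K labelled (that is, distinguishable) copies of a hypothetical two-level component; the helper name `thermo`, the level values, and the units with $k_B = 1$ are assumptions made for the illustration.

```python
import itertools
import math

def thermo(levels, beta):
    """Return (F, U, S) for a list of energy eigenvalues, with k_B = 1."""
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    u = sum(e * w for e, w in zip(levels, weights)) / z
    f = -math.log(z) / beta
    s = beta * (u - f)          # S = (U - F) / T
    return f, u, s

single = [0.0, 1.0]             # hypothetical two-level component
k, beta = 4, 0.8

# Eigenvalues of the composite: one sum per assignment of levels
composite = [sum(combo) for combo in itertools.product(single, repeat=k)]

f1, u1, s1 = thermo(single, beta)
fk, uk, sk = thermo(composite, beta)
print(fk, k * f1, uk, k * u1, sk, k * s1)
```

Each composite value should equal K times the single-component value. No permutation factor enters here because the components carry distinguishing labels.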


Generalizing further, let us assume that the Hamiltonian of the system
under consideration be made up of K ($K = 1, 2, \ldots$) number of independent
Hamiltonians $\hat{H}^{(1)}, \ldots, \hat{H}^{(K)}$ ($H^{(1)}, \ldots, H^{(K)}$ in
the classical context), each of which is, in turn, the sum of a number of
independent but identical components. Thus, if $\hat{H}^{(k)}$, or $H^{(k)}$, as
the case may be, is the sum of $N_k$ independent but identical components
($k = 1, \ldots, K$), then we have

$$Z = \prod_{k=1}^{K} z_k^{N_k} \qquad \Big(\sum_k N_k = N\Big). \tag{3-4c}$$

In formulas (3-4b), (3-4c) above, the lower case letters (z or
$z_1, \ldots, z_K$) refer to the identical components of a single species or
of a number (K) of species, where the system under consideration may be made
up of a number of distinct species.


An important variant of (3-4b), applicable to the classical context, relates
to a system made up of N number of identical particles, an interchange
among which does not generate a new microstate, such as an ideal gas
consisting of N number of identical molecules moving about in a given
volume V (refer to sec. 3.2.2.1 below). In this case one obtains, instead
of (3-4b), the following expression for the partition function,

$$Z = \frac{z^N}{N!}, \tag{3-5}$$

where the factor of N! in the denominator arises from permutations among
the identical particles, since such permutations do not produce distinct
microstates. In contrast, formula (3-4b) applies to a system of distinct
constituents (classical or quantum mechanical), all having the same
partition function z.


The formula (3-5) does not apply to the quantum context since the quantum 
mechanical concept of identical particles differs from the classical concept. 


While the particles can, in principle, be distinguishable in the classical
theory, they are indistinguishable in the quantum description, which brings
in additional constraints on the wave functions arising from symmetry
requirements.


We briefly consider below a number of applications of the above results
along with those following from the principle of equipartition which,
however, applies to the classical context alone. At times, we will leave
implied the nature of the system considered (classical or quantum
mechanical), which will be clear from the context. We will have to keep in
mind that the classical formulas can be obtained from the more general
quantum mechanical ones in a limiting sense. This will be seen to be
particularly relevant in the context of the ‘partially classical systems’
discussed below (sec. 3.1.1).


3.1.1 ‘Partially classical’ systems 


Referring to formula (3-2b), it may so happen that one can use the classical
approximation for the calculation of one of the two partition functions
$z^{(1)}$, $z^{(2)}$ (we adopt the notation that partition functions of
‘component’ systems are all denoted by lower case symbols), while the other
is to be calculated in full quantum terms. More generally, it is possible
that Z breaks up into a product of several factors (eq. (3-4a)) among which
the classical approximation can be adopted for some while the rest of the
factors require quantum mechanical considerations. Such systems will be
referred to as ‘partially classical’ ones.
A surprisingly large number of systems of practical interest can be
described as partially classical in this sense. For instance, consider a dilute
gas of non-interacting molecules where each molecule, apart from hav- 
ing translational degrees of freedom, may have other ‘internal’ degrees 
of freedom as well, like those corresponding to vibrational and rotational 
motions of the molecules. Even the nuclei of the atoms constituting the 
molecules are dynamical systems, with energy levels of their own. We 
first assume that the molecules can be treated as identical but distin- 
guishable particles, in which case the partition function of the gas, made 
up of N number of molecules in a volume V, can be expressed in the 
form (3-5), where z stands for the partition function of a single molecule. 


As explained in sec. 3.3.4, this requires that the condition

$$\frac{N}{V} \ll \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2} \tag{3-6}$$

be satisfied, m being the mass of a molecule (refer to (3-60b)). One then
needs to evaluate the partition
function (z) of a single molecule of the gas. It is here that the internal de- 
grees of freedom are to be taken into account, along with the translational 
degrees of the molecule. All these degrees of freedom can be assumed, in 
an approximate sense, to be independent of one another so that one can 


write, for instance, 
$$H = H_{\rm trans} + H_{\rm rot} + H_{\rm vib} + H_{\rm el} + H_{\rm nuc} \tag{3-7a}$$


for the molecular Hamiltonian where the successive terms on the right- 
hand side correspond respectively to translations of the molecule, molec- 
ular rotations, molecular vibrations, motions of electrons in the molecule, 


and motions within the nuclei, like those of the nucleons and of the 
quarks and gluons, and so on, and where some of these are to be treated
as classical and some others as quantum mechanical, depending on cir- 
cumstances. The partition function of the system then factorizes into a 
product of partition functions with a factor coming from each individual 
term for the molecule. One thereby arrives at a convenient and useful 
scheme for working out quantitative results for dilute gases. In other 


words, 


$$z = z_{\rm trans}\, z_{\rm rot}\, z_{\rm vib}\, z_{\rm el}\, z_{\rm nuc}, \tag{3-7b}$$


where the notation is self-evident. Now, it so happens that the average 
spacing between the energy levels resulting from each term in (3-7a) is 
usually orders of magnitude smaller compared to that from the succeeding
term. While the average spacing for $H_{\rm trans}$, computed on the
temperature scale (through division by $k_B$), is typically 107!°K, that for
$H_{\rm nuc}$ is larger than 10'K.


In this context, there exist well-defined temperature ranges in which the
condition for the validity of the classical approximation (in which the
partition function can be approximated as a phase space integral) is
satisfied for some of the factors in (3-7b), while the other factors do not
lend themselves to the classical approximation. For instance, at
temperatures of the order of a hundred degrees Kelvin, sufficiently
high-lying levels of only $H_{\rm trans}$ are excited, with the probabilities
of occupation of all but the lowest levels resulting from the other terms
being negligibly small. In working out the statistical mechanics of the
system in this temperature range, it is a useful and convenient
approximation then to use the actual quantum mechanical expressions for the
partition functions other than $z_{\rm trans}$, even going so far as to ignore
altogether the degrees of freedom corresponding to, say, $H_{\rm vib}$ onward,
and to consider only the lowest-lying rotational levels. As for
$z_{\rm trans}$, it is convenient to use the expression (2-74b).


Recall that the probability of occurrence (or ‘excitation’) of a state with
energy E in the canonical ensemble is $\propto e^{-\beta E}$; at any given
temperature, the probability decreases as one goes to higher energies.


At somewhat higher temperatures, one may apply the classical approximation
to $z_{\rm rot}$ (which is equivalent to assuming that sufficiently high-lying
rotational levels are excited) while continuing to ignore the excitations
corresponding to the remaining degrees of freedom, at best considering
only the lowest few vibrational levels. The scheme may be continued to
still higher temperatures in essentially a similar manner.


We illustrate the above basic principles by looking at the effect of rotational degrees of freedom on the statistical mechanics of diatomic molecules, assuming that the vibrational degrees are ‘frozen in’. However, as we will point out in the case of homonuclear molecules, the nuclear spin degeneracy plays a role that cannot be ignored.

The scheme of approximation referred to above follows the basic idea underlying the Born-Oppenheimer approximation ([2], chapter 8), also termed the adiabatic approximation.

3.1.1.1 Rotational partition function for diatomic molecules


Consider a heteronuclear diatomic molecule in equilibrium with a heat bath at temperature $T$. The rotational energy levels of the molecule are of the form

$$E_l = \frac{\hbar^2}{2I}\, l(l+1), \tag{3-8}$$

where $l$ stands for the angular momentum quantum number of the molecule and $I$ represents an effective moment of inertia. Each level is $(2l+1)$-fold degenerate. The rotational partition function is given by

$$z_{\rm rot} = \sum_{l=0}^{\infty} (2l+1)\, e^{-\beta\hbar^2 l(l+1)/(2I)}, \tag{3-9}$$

where the degeneracy factor is included since the partition function is defined as a sum over states (and not over energy levels).


At any temperature $T$ satisfying $k_BT \ll \frac{\hbar^2}{2I}$ it suffices to keep only the terms $l = 0, 1$ in the sum, which gives

$$z_{\rm rot} \approx 1 + 3\, e^{-\hbar^2/(I k_B T)}. \tag{3-10a}$$

Note that, at sufficiently low temperatures even the second term can be dropped (and so, $z_{\rm rot} \approx 1$), with only an exponentially small error.

On the other hand, for $k_BT \gg \frac{\hbar^2}{2I}$ the summation in (3-9) can be replaced by an integral, and one obtains [85],

$$z_{\rm rot} \approx \frac{2Ik_BT}{\hbar^2}, \tag{3-10b}$$

which constitutes the classical approximation for $z_{\rm rot}$. One could work out this approximation by referring to the classical Hamiltonian function for rotational motion in terms of relevant angular variables and the corresponding angular momenta. The former turn out to be cyclic coordinates and the latter occur quadratically and additively, the range of variation of each of these being from $-\infty$ to $\infty$. The relevant phase space integrals can then be evaluated, leading, precisely, to (3-10b) (Planck's constant appears in the denominator as one partitions the phase space into fundamental phase cells). The equipartition result holds: there are two rotational degrees of freedom, and so the mean energy should work out to $k_BT$. This is precisely what one obtains from (3-10b) (check this out; refer to formula (2-67b)).
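The two limiting forms (3-10a) and (3-10b) can be compared with a direct numerical evaluation of the sum (3-9). The following is only a minimal sketch: the reduced temperature `t` (meaning $k_BT$ in units of $\hbar^2/2I$), the cutoff `l_max`, and all function names are illustrative choices and not notation used in the text.

```python
import math

def z_rot_exact(t, l_max=2000):
    """Direct evaluation of the sum (3-9); t is k_B*T in units of hbar^2/(2I)."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t) for l in range(l_max + 1))

def z_rot_low(t):
    """Two-term truncation (3-10a): only l = 0 and l = 1 are kept."""
    return 1.0 + 3.0 * math.exp(-2.0 / t)

def z_rot_classical(t):
    """Classical limit (3-10b): 2*I*k_B*T/hbar^2, which is just t in reduced units."""
    return t

def mean_rot_energy(t, dt=1e-4):
    """Mean rotational energy in units of hbar^2/(2I), from u = t^2 d(ln z)/dt
    evaluated with a central difference."""
    dlnz = math.log(z_rot_exact(t + dt)) - math.log(z_rot_exact(t - dt))
    return t * t * dlnz / (2 * dt)

if __name__ == "__main__":
    for t in (0.2, 1.0, 10.0, 100.0):
        print(t, z_rot_exact(t), z_rot_low(t), z_rot_classical(t), mean_rot_energy(t) / t)
```

At small `t` the truncated form tracks the full sum closely, while at large `t` the ratio of mean energy to `t` approaches 1, which is the equipartition value $k_BT$ for two rotational degrees of freedom.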


The partition function of an ideal gas made up of $N$ diatomic molecules can now be worked out from (refer to (3-5))

$$Z = \frac{1}{N!}\left(z_{\rm trans}\, z_{\rm rot}\right)^N, \tag{3-11}$$

where $z_{\rm trans}$ is approximated by the classical expression (2-74b). This amounts to replacing $z_{\rm vib}$, $z_{\rm el}$, $z_{\rm nuc}$ with unity on the assumption that the vibrational, electronic, and nuclear modes are not excited (i.e., are ‘frozen in’) in the temperature range under consideration.


This approximation, based on the assumption that some of the modes of excitation are frozen in (i.e., energy levels higher than the lowest ones are not excited; this is an essentially quantum mechanical consideration) while, among the other modes that are excited, some can be treated classically, is a hugely rewarding one, and is almost universally (though implicitly in most cases) employed in statistical mechanics.


Knowing the partition function, all the relevant thermodynamic parameters can be obtained by differentiation. In particular, the mean energy at temperature $T$ is obtained by making use of (2-67b). In the temperature range in which the classical approximation can be used for the rotational partition function (eq. (3-10b)), the mean energy can be obtained directly from the principle of equipartition, and one obtains (taking into account the translational and rotational terms in the Hamiltonian)

$$U \approx \frac{5}{2} N k_B T. \tag{3-12a}$$

The corresponding expression for the specific heat is then obtained as

$$C_V \approx \frac{5}{2} N k_B, \qquad C_V^{\rm (molar)} \approx \frac{5}{2} R. \tag{3-12b}$$


The number of independent quadratic terms in the Hamiltonian is, at times, 
referred to as the number of ‘degrees of freedom’, in the context of the 
equipartition principle. This interpretation does not always agree with the 
usage in mechanics where the term signifies the number of independent 
variables that completely specify the instantaneous dynamical state of a 


system. 


The reason why we restricted our considerations in this section to heteronuclear molecules only is that homonuclear molecules are subject to special symmetry requirements arising from the indistinguishability of the nuclei making these up. This becomes apparent as one looks at the statistical mechanics and thermodynamics of ortho- and para-hydrogen. Here the indistinguishability of the two nuclei and the resulting symmetry requirement lead to two possibilities: (i) the total nuclear spin quantum number has the value 0 (spin singlet), owing to which the angular momentum quantum number $l$ characterizing the rotational state of the molecule can have only even values ($l = 0, 2, 4, \cdots$); this is the case of para-hydrogen; (ii) the total nuclear spin quantum number is 1 (spin triplet), owing to which the angular momentum quantum number $l$ can have only odd values ($l = 1, 3, 5, \cdots$); this is the case of ortho-hydrogen. Making use of these values of the quantum numbers, one can work out $z_{\rm rot}$ for the two varieties, both at low and at high temperatures (depending on the parameter $\frac{k_BTI}{\hbar^2}$), and then obtain the partition function and the thermodynamic parameters for the pure para- and ortho- varieties by proceeding as in the case of the heteronuclear molecule; for details, see [85]. If, on the other hand, we have a mixture containing fractions $x$, $1-x$ of ortho- and para-hydrogen molecules with a given value of $x$, one obtains, for the partition function of the mixture containing a total of $N$ molecules,

$$Z = \frac{1}{N!}\, z_{\rm trans}^{N} \left(3\,(z_{\rm rot})_{\rm ortho}\right)^{xN} \left((z_{\rm rot})_{\rm para}\right)^{(1-x)N}, \tag{3-13}$$

where $z_{\rm trans}$ is obtained from (2-74b), and the factor 3 multiplying $(z_{\rm rot})_{\rm ortho}$ owes its origin to nuclear spin degeneracy (3 for the triplet state and 1 for the singlet).


The fraction $x$ specifying the abundance of ortho-hydrogen in the mixture, however, gets determined under the condition of thermodynamic equilibrium [85]. One obtains the result that at sufficiently low temperatures, $x \to 0$ owing to the lower rotational ground state energy of the para- variety while, in the high temperature limit, $x \to \frac{3}{4}$, as determined by the nuclear spin degeneracy.
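The two limits of the equilibrium ortho fraction can be illustrated numerically. The sketch below assumes the standard equilibrium weighting $x = 3z_{\rm odd}/(3z_{\rm odd} + z_{\rm even})$, where $z_{\rm odd}$ and $z_{\rm even}$ are the sums (3-9) restricted to odd and even $l$; this formula is not written out in the text, and the reduced temperature `t` ($k_BT$ in units of $\hbar^2/2I$) and the function names are illustrative.

```python
import math

def z_even(t, l_max=400):
    """Rotational sum over even l (para variety); t is k_B*T in units of hbar^2/(2I)."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t) for l in range(0, l_max + 1, 2))

def z_odd(t, l_max=400):
    """Rotational sum over odd l (ortho variety), without the spin degeneracy factor."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t) for l in range(1, l_max + 1, 2))

def ortho_fraction(t):
    """Equilibrium ortho fraction, with nuclear spin weights 3 (triplet) and 1 (singlet)."""
    w_ortho = 3.0 * z_odd(t)
    return w_ortho / (w_ortho + z_even(t))

if __name__ == "__main__":
    for t in (0.05, 0.5, 2.0, 20.0, 200.0):
        print(t, ortho_fraction(t))
```

The printed values rise from essentially zero at low `t` toward 3/4 at high `t`, in line with the two limits stated above.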


The quantum mechanics of sets of indistinguishable particles will be outlined in greater detail in sec. 3.3.1 (see also section 7.2).


Further applications of the basic principles of equilibrium statistical mechanics to idealized simple systems will be found in later sections of this chapter. Before that, we will have a look at the statistical mechanics of the classical and the quantum mechanical ideal gas. Results relating to the ideal gas will be found to provide a good understanding of numerous systems of practical interest.


3.2 The classical ideal gas 


An ideal gas is made up of particles that have no mutual interaction and are identical to one another. To start with, we assume the particles to be point-like entities, in which case the phase space of the system is made up of the position and momentum co-ordinates, and the Hamiltonian is of the form (1-7) which specializes, in the particular case of the ideal gas (where the interaction term vanishes) made up of $N$ particles, to

$$H = \sum_{a=1}^{N} \frac{\mathbf{p}_a^2}{2m}, \tag{3-14}$$

in a notation by now familiar. The only interaction that the particles participate in is the one with the wall of the containing chamber (of volume $V$) causing them to remain confined. We assume that, on encountering the wall, a particle suffers an elastic collision with its energy remaining unchanged. Alternatively, one may assume periodic boundary conditions at the walls of a containing box in the shape of a rectangular parallelepiped. In most of the considerations, the boundary conditions do not make an explicit appearance in virtue of an (implicit) assumption that the volume ($V$) of the box is sufficiently large (refer to chapter 5 for basic ideas relating to the thermodynamic limit).


3.2.1 The classical ideal gas: the microcanonical description


Referring to the classical microcanonical ensemble, the entropy of the ideal gas, with given values of $U, V, N$, equals (up to a factor $k_B$) the logarithm of the number of phase cells in the region $R_U$ of the phase space (refer to fig. 2-2), corresponding to the total energy of the system lying within the range $U$ to $U + \delta U$ ($\delta U \ll U$; $U$ is interpreted as the internal energy in the thermodynamic limit; in what follows, we will comment on the energy width $\delta U$), where the latter is given by

$$W = \frac{1}{h^{3N} N!} \int_{R_U} dQ\, dP, \tag{3-15}$$

this being the microcanonical partition function of the system (refer to (2-57b)) (recall our notational convention in which $Q, P$ stand collectively for the position and momentum variables of the particles). In the above expression, the region $R_U$ is given by $U < \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m} < U + \delta U$ and does not depend on the position co-ordinates $Q (= \{q_1, q_2, \cdots, q_{3N}\})$, which means that the integration over the position co-ordinates can be performed without regard to $R_U$. The threefold integration over the position co-ordinates of each of the $N$ particles will therefore give $V$, the volume of the confining vessel, and we will thus have

$$W = \frac{V^N}{h^{3N} N!} \int_{R_U} dp_1\, dp_2 \cdots dp_{3N}, \tag{3-16}$$

where $p_1, \cdots, p_{3N}$ stand for the $3N$ components of the momenta $\mathbf{p}_1, \cdots, \mathbf{p}_N$, ordered in some appropriate manner.


The integral here is one in a $3N$-dimensional Euclidean space in which $\sum_i p_i^2$ can be interpreted as the squared distance of a point from a chosen origin, and the region $R_U$ then corresponds to one within a ‘spherical’ shell of inner and outer radii $\sqrt{2mU}$ and $\sqrt{2m(U + \delta U)}$ in this $3N$-dimensional space (reason this out). The volume measure of a spherical region of radius $R$ in this space can be worked out by introducing polar co-ordinates instead of the co-ordinates $p_1, \cdots, p_{3N}$ (the ‘Cartesian’ ones), and is given by the expression

$$\int_{\sum_i p_i^2 \le R^2} dp_1 \cdots dp_{3N} = \frac{\pi^{3N/2} R^{3N}}{\Gamma\left(\frac{3N}{2} + 1\right)} \tag{3-17}$$

(check this out) where $\Gamma(u)$ stands for the gamma function with argument $u$. Formula (3-16) then gives

$$W = \frac{\pi^{3N/2} V^N}{h^{3N} N!\, \Gamma\left(\frac{3N}{2} + 1\right)} \left( (2m(U + \delta U))^{3N/2} - (2mU)^{3N/2} \right). \tag{3-18}$$


This expression doesn’t tell us much for any arbitrarily specified value of $N$. However, for $N$ large, one can use asymptotic expressions for the logarithms of the factorial function and the gamma function to write

$$\text{[large } N \text{ (Stirling's approximation):]} \quad \ln \Gamma\left(\frac{3N}{2} + 1\right) \approx \frac{3N}{2}\ln\frac{3N}{2} - \frac{3N}{2}, \qquad \ln N! \approx N\ln N - N, \tag{3-19}$$

and, moreover, can make use of the approximate binomial expansion

$$(2m(U + \delta U))^{3N/2} \approx (2mU)^{3N/2}\left(1 + \frac{3N\delta U}{2U}\right) \tag{3-20}$$

so as to finally obtain the asymptotic expression for the entropy $S (= k_B \ln W)$

$$\text{[large } N\text{:]} \quad S = k_B\left[ N\left( \frac{5}{2} + \ln\frac{V}{N} + \frac{3}{2}\ln\frac{4\pi m U}{3Nh^2} \right) + \ln\frac{3N\delta U}{2U} \right] \tag{3-21}$$

(check this out). One can now see what happens at the thermodynamic limit, i.e., for $N \to \infty$, $V \to \infty$. One finds from the above expression that, as $N \to \infty$, with $\lim_{N\to\infty} \frac{V}{N} = v$ (the specific volume) with a finite $v$, the internal energy $U$ scales with $N$ as an extensive quantity (i.e., $\lim_{N\to\infty} \frac{U}{N} = u$, $u$ finite) if $S$ is so (i.e., the specific entropy $\lim \frac{S}{N}$ tends to a finite limit $s$), and vice versa and, at this limit, the assumed value of the energy range $\delta U$ in the definition of the microcanonical ensemble has no relevance in so far as the values of the thermodynamic functions are concerned, provided $\delta U$ grows with $N$ no faster than some finite power of the latter (check this out).


At this point, one can check that the alternative definition of the microcanonical ensemble, where the distribution function $\rho$ is defined as in (2-63) rather than by (2-56), leads to the same expression for the entropy as above in the thermodynamic limit. Based on (2-63) one obtains, instead of (3-18), the following formula for the number of relevant phase cells:

$$W = \frac{\pi^{3N/2} V^N}{h^{3N} N!\, \Gamma\left(\frac{3N}{2} + 1\right)} (2mU)^{3N/2}, \tag{3-22}$$

which involves, instead of the volume measure of a spherical shell of inner and outer radii $\sqrt{2mU}$ and $\sqrt{2m(U + \delta U)}$ in the $3N$-dimensional space (with $p_1, p_2, \cdots, p_{3N}$ as co-ordinates) mentioned above, the volume measure of the sphere of radius $\sqrt{2mU}$. However, in calculating the entropy in terms of $U, N, V$ one needs the logarithm of $W$ rather than $W$ itself, when one obtains the same expression as that in (3-21), with only the last term within the brackets ($\ln\frac{3N\delta U}{2U}$) dropping out. But this term has been seen to be negligible anyway in the thermodynamic limit.
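The reason the shell and the full sphere give the same entropy can be made concrete: in a space of very high dimension almost all of the volume of a ball lies in a thin layer just below its surface. The short sketch below checks the volume formula (3-17) in the familiar cases of two and three dimensions and then evaluates the fraction of volume in an outer layer; the function names and the chosen numbers are illustrative assumptions.

```python
import math

def ball_volume(n, radius):
    """Volume of an n-dimensional ball, pi^(n/2) R^n / Gamma(n/2 + 1); cf. (3-17) with n = 3N."""
    return math.pi ** (n / 2) * radius ** n / math.gamma(n / 2 + 1)

def shell_fraction(n, eps):
    """Fraction of the ball's volume lying between radii R(1 - eps) and R.
    The volume scales as R^n, so the fraction is 1 - (1 - eps)^n for any R."""
    return 1.0 - (1.0 - eps) ** n

if __name__ == "__main__":
    print(ball_volume(2, 1.5), math.pi * 1.5 ** 2)          # area of a disc
    print(ball_volume(3, 2.0), 4.0 / 3.0 * math.pi * 8.0)   # volume of a sphere
    for n in (3, 30, 300, 3000):
        print(n, shell_fraction(n, 0.01))
```

For `n = 3` a layer of 1% thickness holds about 3% of the volume, whereas for `n = 3000` it holds practically all of it, which is why replacing the shell by the whole sphere changes $\ln W$ only by a negligible amount.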


Here and in the following we will assume that the width $\delta U$ entering into the definition of the microcanonical ensemble (refer to (2-56)) is small compared to $U$, growing at most linearly with $N$ (recall that, in the thermodynamic limit, $U$ itself grows linearly with $N$). One then obtains the following form for the fundamental relation of the classical ideal gas

$$\text{[large } N\text{:]} \quad S = Nk_B\left( \frac{5}{2} + \ln\frac{V}{N} + \frac{3}{2}\ln\frac{4\pi mU}{3Nh^2} \right). \tag{3-23}$$
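How quickly the large-$N$ form (3-23) is approached can be checked by evaluating $\ln W$ from (3-22) without Stirling's approximation. In the sketch below the state of the gas enters only through the dimensionless combination $c = \frac{V}{N}\left(\frac{4\pi mU}{3Nh^2}\right)^{3/2}$, held fixed as $N$ grows; this parametrization, the value chosen for `c`, and the function names are assumptions made for illustration.

```python
import math

def entropy_per_particle_exact(n, c):
    """ln W / N from (3-22), using log-gamma instead of Stirling's approximation.
    c = (V/N) * (4*pi*m*U / (3*N*h^2))**1.5 is dimensionless."""
    ln_w = (n * math.log(c) + n * math.log(n) + 1.5 * n * math.log(1.5 * n)
            - math.lgamma(n + 1) - math.lgamma(1.5 * n + 1))
    return ln_w / n

def entropy_per_particle_large_n(c):
    """S / (N k_B) from the fundamental relation (3-23)."""
    return 2.5 + math.log(c)

if __name__ == "__main__":
    c = 1.0e5
    for n in (10, 1000, 100000, 10000000):
        print(n, entropy_per_particle_exact(n, c), entropy_per_particle_large_n(c))
```

The difference between the two columns shrinks roughly as $(\ln N)/N$, so the entropy per particle is extensive only in the large-$N$ limit.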


Making use, now, of the formula (2-59), one obtains the following expression for the internal energy as a function of the temperature

$$U = \frac{3}{2} N k_B T. \tag{3-24}$$


On the other hand, the expression for the pressure of the gas is obtained as

$$p = T\left(\frac{\partial S}{\partial V}\right)_{U,N} = \frac{Nk_BT}{V}, \tag{3-25}$$

thereby leading to the ideal gas equation of state

$$pV = \nu RT, \tag{3-26}$$

where $\nu$ stands for the number of moles of the gas, related to $N$ as $\nu = \frac{Nk_B}{R}$.
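The passage from the fundamental relation (3-23) to (3-24) and (3-25) can be mimicked numerically by differentiating the entropy. The sketch below assumes that (2-59) is the usual relation $1/T = (\partial S/\partial U)_{V,N}$, works in units with $k_B = 1$, and sets $m = h = 1$ purely for convenience; the function names and numerical values are illustrative.

```python
import math

def entropy(u, v, n, m=1.0, h=1.0):
    """Fundamental relation (3-23) in units with k_B = 1 (m = h = 1 is an arbitrary choice)."""
    return n * (2.5 + math.log(v / n) + 1.5 * math.log(4 * math.pi * m * u / (3 * n * h * h)))

def temperature(u, v, n, du=1e-4):
    """T from 1/T = (dS/dU) at fixed V and N, using a central difference."""
    ds = entropy(u + du, v, n) - entropy(u - du, v, n)
    return 2 * du / ds

def pressure(u, v, n, dv=1e-4):
    """p = T (dS/dV) at fixed U and N, as in (3-25)."""
    ds = entropy(u, v + dv, n) - entropy(u, v - dv, n)
    return temperature(u, v, n) * ds / (2 * dv)

if __name__ == "__main__":
    u, v, n = 150.0, 20.0, 100.0
    t = temperature(u, v, n)
    print(t, 2 * u / (3 * n))            # compare with U = (3/2) N k_B T
    print(pressure(u, v, n), n * t / v)  # compare with p V = N k_B T
```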


The results in this section are for the ideal monatomic gas where the 
constituent particles have no internal degrees of freedom. One can also 
consider, for instance, a diatomic gas that is ideal in the sense that 
the molecules have no mutual interaction while, at the same time, the 
Hamiltonian function for the gas involves additional terms as compared 
with (3-14), corresponding to vibrations and rotations of the molecules. 


A number of introductory ideas about ideal gases with internal molecular structures have been outlined in sec. 3.1.1.1, though with reference to the canonical ensemble.


3.2.2 The classical ideal gas: canonical and grand canonical descriptions

It is now not difficult to quickly go through the description of the classical ideal gas in terms of the canonical and the grand canonical ensembles.

3.2.2.1 The canonical ensemble description


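The canonical partition function for the ideal gas at temperature $T = (k_B\beta)^{-1}$ is given by the expression

$$Z_c = \frac{1}{h^{3N} N!} \int dQ\, dP\, \exp\left[-\frac{\beta}{2m}\sum_{i=1}^{N} \mathbf{p}_i^2\right], \tag{3-27}$$

which, on evaluation of the spatial integrals and the Gaussian momentum integrals, gives, for the free energy function $F$,

$$Z_c = e^{-\beta F} = \frac{V^N}{h^{3N} N!}\left(\frac{2\pi m}{\beta}\right)^{3N/2} = \frac{V^N}{N!}\,\frac{1}{\lambda_T^{3N}}, \tag{3-28}$$

where

$$\lambda_T = \frac{h}{\sqrt{2\pi m k_B T}} \tag{3-29}$$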


is referred to as the thermal de Broglie wavelength. As we will see, the thermal de Broglie wavelength, which gives an estimate of the linear dimension of the region within which a particle of mass $m$ can be assumed to be localized when it is in interaction with a thermal bath at temperature $T$, is of great relevance in determining the behavior of a fluid at temperature $T$. The ratio $\frac{\lambda_T^3}{v}$ (with $v = \frac{V}{N}$ the specific volume) will be seen to be the characteristic parameter that demarcates the classical behavior of the fluid from its quantum mechanical behavior (refer to section 3.3.4).
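To get a feeling for the orders of magnitude involved, one can evaluate $\lambda_T$ from (3-29) for a common gas. The sketch below uses helium-4 at room temperature and atmospheric pressure as an example; the choice of gas, the approximate atomic mass, and the function names are assumptions made for illustration, and the number density is obtained from the ideal gas law.

```python
import math

H = 6.62607015e-34    # Planck constant in J s (exact SI value)
K_B = 1.380649e-23    # Boltzmann constant in J/K (exact SI value)

def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength (3-29), in metres for SI inputs."""
    return H / math.sqrt(2 * math.pi * mass * K_B * temperature)

def degeneracy_parameter(number_density, mass, temperature):
    """Dimensionless ratio lambda_T^3 / (V/N), i.e. number density times lambda_T^3."""
    return number_density * thermal_wavelength(mass, temperature) ** 3

if __name__ == "__main__":
    m_he = 6.6464731e-27                   # approximate mass of a helium-4 atom in kg
    t_room = 300.0                         # kelvin
    density = 101325.0 / (K_B * t_room)    # ideal-gas number density at 1 atm
    print(thermal_wavelength(m_he, t_room))
    print(degeneracy_parameter(density, m_he, t_room))
```

The wavelength comes out at a fraction of an angstrom and the dimensionless parameter is of order $10^{-6}$, so the gas is deep in the classical regime under these conditions.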


The free energy of the classical ideal gas, obtained in accordance with formula (2-68), works out to

$$F = -Nk_BT\left[\ln V + \frac{3}{2}\ln\frac{2\pi m k_B T}{h^2} - \frac{1}{N}\ln N!\right]. \tag{3-30}$$

The entropy $S$ is then obtained from the second relation in (2-69) as

$$S = Nk_B\left[\ln V + \frac{3}{2}\ln\frac{2\pi m k_B T}{h^2} + \frac{3}{2} - \frac{1}{N}\ln N!\right], \tag{3-31}$$

while the internal energy, evaluated by working out the relevant Gaussian integrals in (2-67a) or (2-67b) (with the Hamiltonian substituted from (3-14)), is seen to be

$$U = \frac{3}{2} N k_B T. \tag{3-32}$$


Note that this result for the mean energy could have been obtained straightaway by invoking the principle of equipartition, since the Hamiltonian (3-14) is made up of $3N$ positive quadratic terms of the type mentioned in sec. 2.2.3.

One finds that, for any given finite value of $N$, the terms involving $\frac{1}{N}\ln N!$ in the above expressions for $F$ and $S$ prevent these from being extensive variables, though $U$ turns out to be explicitly so. However, as observed several times in earlier paragraphs, the formula $F = U - TS$, analogous to the thermodynamic relation between $F, U, S$, is satisfied, even for finite $N$. Additionally, the first relation in (2-69) shows that the ideal gas equation of state $pV = Nk_BT$ is also satisfied.


In the thermodynamic limit, $\ln N!$ can be replaced with its Stirling approximation $N\ln N - N$, leading to the following expressions for the free energy and the entropy where both appear as extensive state functions:

$$\text{(large } N\text{:)} \quad F = -Nk_BT\left[\ln\frac{V}{N} + \frac{3}{2}\ln\frac{2\pi m k_B T}{h^2} + 1\right], \tag{3-33a}$$

$$\text{(large } N\text{:)} \quad S = Nk_B\left[\ln\frac{V}{N} + \frac{3}{2}\ln\frac{2\pi m k_B T}{h^2} + \frac{5}{2}\right], \tag{3-33b}$$

where the last expression agrees with (3-23) in view of (3-32). This is in accordance with the statement that in the large $N$ limit, the canonical and the microcanonical ensembles turn out to be equivalent to each other.


3.2.2.2 The grand canonical ensemble description 


The grand canonical ensemble gives a complete description of the behavior of a classical ideal gas in terms of parameters $V, T, \mu$, from which one can derive expressions for the grand potential ($\Omega$), the entropy ($S$), the pressure ($p$), the mean energy ($U$, interpreted as the internal energy in the thermodynamic limit), and the mean particle number ($\bar{N}$, related to the mole number $\nu$ as $\bar{N} = \nu A$, where $A$ stands for the Avogadro number). These functions satisfy relations analogous to thermodynamic ones even for a ‘small’ system (i.e., one away from the thermodynamic limit), for which $\Omega, S, U$, and $\bar{N}$ do not satisfy the extensivity requirement (note that, for the grand canonical ensemble, the thermodynamic limit is defined in terms of $V$, with fixed $T, \mu$).


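The grand partition function is given by the expression (refer to formulas (2-96) and (3-28))

$$Z = \sum_{N=0}^{\infty} \gamma^N Z_c(N) = \sum_{N=0}^{\infty} \frac{(\gamma z)^N}{N!} = e^{\gamma z}, \tag{3-34a}$$

where

$$z = \frac{V}{\lambda_T^3}, \tag{3-34b}$$

and $\gamma$ stands for the fugacity

$$\gamma = e^{\beta\mu}. \tag{3-34c}$$

This gives the grand potential, worked out from (2-95a),

$$\Omega = -\beta^{-1}\ln Z = -\beta^{-1}\,\frac{\gamma V}{\lambda_T^3}, \tag{3-35}$$

from which one obtains the other relevant thermodynamic state functions by differentiation. One thereby obtains

$$\bar{N} = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{V,T} = \frac{\gamma V}{\lambda_T^3}, \tag{3-36a}$$

$$p = -\left(\frac{\partial\Omega}{\partial V}\right)_{T,\mu} = \frac{\gamma}{\beta\lambda_T^3} = \frac{\bar{N}k_BT}{V}, \tag{3-36b}$$

$$U = \left(\frac{\partial(\beta\Omega)}{\partial\beta}\right)_{V,\gamma} = \frac{3}{2}\bar{N}k_BT, \tag{3-36c}$$

and, finally,

$$S = -\left(\frac{\partial\Omega}{\partial T}\right)_{V,\mu} = \bar{N}k_B\left[\frac{5}{2} + \ln\frac{V}{\bar{N}} + \frac{3}{2}\ln\frac{2\pi m k_B T}{h^2}\right] \tag{3-36d}$$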


(check these results out; in particular, note the first equality in (3-36c)). It is apparent that, in this simple and special case of the classical ideal gas, $\Omega, \bar{N}, U$ and $S$ are all extensive variables (i.e., ones proportional to $V$ for fixed $T, \mu$, and also for the assumed fixed values of the non-thermodynamic variables, refer to sec. 2.1.3) even before the thermodynamic limit is taken, and the thermodynamic relation

$$\Omega = U - TS - \mu\bar{N} \tag{3-37}$$

is satisfied. All these relations agree with those derived in sec. 3.2.2.1 in the thermodynamic limit ($V \to \infty$) when one can make the replacement $\bar{N} \to N$, where $N$ stands for the number of particles making up the gas in the canonical description.
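The grand canonical results can be verified numerically for any chosen $V, T, \mu$. The sketch below evaluates the grand potential, mean particle number, pressure, mean energy and entropy of the classical ideal gas in closed form and then checks the relation $\Omega = U - TS - \mu\bar{N}$; the default units $m = h = k_B = 1$, the function name, and the dictionary keys are illustrative assumptions.

```python
import math

def grand_canonical_ideal_gas(volume, temperature, mu, m=1.0, h=1.0, k_b=1.0):
    """Closed-form grand canonical quantities for the classical ideal gas.
    The default units m = h = k_B = 1 are an arbitrary illustrative choice."""
    beta = 1.0 / (k_b * temperature)
    lam = h / math.sqrt(2 * math.pi * m * k_b * temperature)   # thermal wavelength (3-29)
    gamma = math.exp(beta * mu)                                # fugacity
    n_mean = gamma * volume / lam ** 3                         # mean particle number
    omega = -n_mean / beta                                     # grand potential
    pressure = n_mean / (beta * volume)
    energy = 1.5 * n_mean / beta
    entropy = n_mean * k_b * (2.5 + math.log(volume / n_mean)
                              + 1.5 * math.log(2 * math.pi * m * k_b * temperature / h ** 2))
    return {"omega": omega, "n_mean": n_mean, "pressure": pressure,
            "energy": energy, "entropy": entropy}

if __name__ == "__main__":
    vol, temp, mu = 10.0, 2.0, -3.0
    r = grand_canonical_ideal_gas(vol, temp, mu)
    print(r)
    print(r["omega"], r["energy"] - temp * r["entropy"] - mu * r["n_mean"])
```

The two numbers on the last line agree, and a finite-difference derivative of `omega` with respect to `mu` reproduces `n_mean`, as the relation $\bar{N} = -(\partial\Omega/\partial\mu)_{V,T}$ requires.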


As a point of interest, note that the internal energy of a given quantity of the classical ideal gas depends on the temperature but is independent of volume, and the specific heat, for any given value of $\bar{N}$ (or of the number of moles) has the constant value

$$C_V = \frac{3}{2}\bar{N}k_B, \tag{3-38}$$

(refer back to (3-32)).


All these results for the classical ideal gas get modified for real gases on two counts: (a) the interaction, however weak, between the particles constituting the gas, and (b) quantum mechanical effects. In the remaining pages of the present chapter we will get to know the quantum effects on the behavior of the ideal gas, where the parameter $\frac{\bar{N}\lambda_T^3}{V}$ ($= \gamma$ for the classical ideal gas; see formula (3-36a)) will be seen to appear as a characteristic of considerable significance, relatively large values of which will be found to correspond to quite notable departure from the classical behavior. The next chapter will be principally devoted to the classical and quantum descriptions of fluids with non-zero interactions among the particles, where we will begin with the theory of dilute gases in which the interactions have only a feeble effect on the behavior of the gas.

3.3 The ideal gas: quantum description


A system made up of identical non-interacting particles confined within 
a given volume V constitutes an ‘ideal gas’, where the number of parti- 
cles N may or may not be a well defined parameter, depending on the 
mode of description of the system, i.e., the constraints defining its behav- 
ior. A complete quantum mechanical description of equilibrium states of 
such a system is conveniently obtained by means of the grand canonical 
ensemble, where the number of particles is not precisely defined in an 
equilibrium state since the constraints defining the state allow for parti- 
cles entering into or leaving the volume V in which the system is confined 
(say, through a semi-permeable boundary or through an open surface). 
The thermodynamic variables for such a state, or for a family of states, 
are all obtained from the grand canonical partition function Z, which is 
given by the formula (2-39b) in the quantum description and by (2-94c) 
in the classical description. We have already seen how the latter leads 
us to the thermodynamics of a classical ideal gas, in sec. 3.2.2.2. It now 
remains to see how a more complete description is obtained by means of 


the former and to relate the classical description to the quantum one. 


The grand canonical partition function involves a sum over all eigenstates of the form $|E_i(N)\rangle$ of the system, corresponding to all possible particle numbers and, for each specified value $N$ ($= 0, 1, 2, \cdots$) of the particle number, all possible energy values $E_i(N)$ ($i = 0, 1, 2, \cdots$) (the lowest energy level for any given value of $N$ is commonly labeled with the index 0). The actual working out of the energy eigenvalues and the sum over states is an almost impossible task for interacting systems, but becomes simpler (up to a point) for composite systems made up of independent parts, of which the ideal gas constitutes an instance.


3.3.1 Systems of indistinguishable particles 


For the present, we consider a composite system made of a number (say, $N$) of indistinguishable particles, the latter constituting the relevant subsystems. For such a system, however, distinguishing labels cannot be attached to the subsystems, and the product vector space, say, $\mathcal{V}^{(1)} \otimes \mathcal{V}^{(2)} \otimes \cdots \otimes \mathcal{V}^{(N)}$ is not the state space of the system under consideration because of the distinguishing labels $1, 2, \cdots, N$ defining it. If, on the other hand, one considers the subspace made up of symmetric vectors or the one of antisymmetric vectors in the product space, then that subspace appropriately describes the states of the system under consideration: the symmetric subspace for a system of bosons, and the antisymmetric subspace for a system of fermions (refer back to sec. 1.3.3.2). For these two subspaces the distinguishing labels of the factor spaces lose their significance (though these are still of notional relevance in defining the product space).


In the case of an interacting system of identical particles, it is no easy matter to find its energy eigenstates in the symmetric or antisymmetric subspace (depending on whether the particles are bosons or fermions) of the product vector space, though it is the set of such eigenstates that assumes relevance for describing the behavior of the composite system. For a non-interacting system, on the other hand, one needs to consider only the symmetric or antisymmetric combinations of the product eigenvectors, which constitute the basic states from which the equilibrium ensembles are made up. Assuming that the relevant system variables are $z_1, \cdots, z_N$ (in the case of a system with a classical counterpart such as an ideal gas $z_i$ stands for a pair $\mathbf{r}_i, \mathbf{p}_i$ for $i = 1, 2, \cdots, N$), the Hamiltonian is of the form

$$H = \sum_{i=1}^{N} H^{(i)}(z_i), \tag{3-39}$$

where $H^{(i)}$ operates in $\mathcal{V}^{(i)}$ ($i = 1, 2, \cdots, N$) (recall that all the $\mathcal{V}^{(i)}$ are copies of the same space $\mathcal{V}$) and where, as mentioned in sec. 3.1 above, the formula (3-39) is a convenient abbreviation, since $H$ operates in the (symmetrized or antisymmetrized) product space. The fact that the particles making up the system are indistinguishable implies that $H^{(i)}$ is the same function of $z_i$ as $H^{(j)}$ is of $z_j$ ($i, j = 1, 2, \cdots, N$).


For such a system of non-interacting indistinguishable particles, we first consider the product space $\mathcal{V}^{(1)} \otimes \mathcal{V}^{(2)} \otimes \cdots \otimes \mathcal{V}^{(N)}$, where the product is taken in some particular order that can be notionally interpreted as referring to the individual particles imagined to be marked with the labels ‘1’, ‘2’, $\cdots$, ‘N’. As already noted, all the factors in the above product are copies of one another (i.e., of a space that we denote by the common symbol $\mathcal{V}$), the energy eigenvectors of each being, say, $|E_a\rangle$, corresponding to eigenvalues $E_a$, where the index $a$ belongs to a set $\{a\}$ that can be identified with the set of non-negative integers in the case of a constituent particle of the ideal gas. An orthonormal basis in the product space is then made up of products of eigenvectors of the form $|E^{(1)}_{a_1}\rangle \otimes |E^{(2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(N)}_{a_N}\rangle$, where the indices $a_1, a_2, \cdots, a_N$ belong to the set $\{a\}$, while the super-indices ‘$(i)$’ ($i = 1, 2, \cdots, N$) refer to the notional labeling mentioned above. The super-indices, however, are not relevant in respect of the eigenvalue for the eigenvector $|E^{(1)}_{a_1}\rangle \otimes |E^{(2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(N)}_{a_N}\rangle$, which is $E_{a_1} + E_{a_2} + \cdots + E_{a_N}$. In other words, any permutation of the super-indices produces a vector in the product space, belonging to one and the same eigenvalue. Among the permutations, some are even ones, while some others are odd, depending on the number of pair-wise exchanges of the indices necessary to generate the permutation.


We now consider the sum of all possible products (for any specified set of indices $a_1, \cdots, a_N$) of the form $\sum_P |E^{(P1)}_{a_1}\rangle \otimes |E^{(P2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(PN)}_{a_N}\rangle$, where $\sum_P$ denotes a summation over all the permutations (even and odd) of the ordered set of super-indices $\{(1), (2), \cdots, (N)\}$, a typical permutation being denoted by the symbol $P$ (with $(P1), \cdots, (PN)$ standing for the permuted super-indices). Each of the terms in the sum is an eigenvector corresponding to the eigenvalue $E_{a_1} + E_{a_2} + \cdots + E_{a_N}$, and hence the sum itself is a symmetric eigenvector belonging to the same eigenvalue. The symmetric subspace of the product space $\mathcal{V}^{(1)} \otimes \mathcal{V}^{(2)} \otimes \cdots \otimes \mathcal{V}^{(N)}$ is then made up of all linear combinations of basis vectors of the above form, for all possible choices of the set $\{a_1, a_2, \cdots, a_N\}$ (without regard to order), whose elements belong to the index set $\{a\}$. The antisymmetric subspace, on the other hand, is made up of the eigenvectors of the form $\sum_P \epsilon_P |E^{(P1)}_{a_1}\rangle \otimes |E^{(P2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(PN)}_{a_N}\rangle$, where $\epsilon_P$ has the value $+1$ or $-1$ depending on whether the permutation $P$ is even or odd (this is referred to as the signature of $P$).


If the independent energy eigenvectors for a single particle constitute a finite set, say, $|E_0\rangle, |E_1\rangle, \cdots, |E_K\rangle$, then $\{a\}$ is the set of integers $\{0, 1, 2, \cdots, K\}$ (note that the index $a$ labels the independent energy eigenvectors rather than eigenvalues, since the latter may be degenerate). For particles confined to a box, $K \to \infty$ while one may also consider, for the sake of generality, systems with a finite dimensional space $\mathcal{V}$ of single-particle states.


In the present context of a system made up of non-interacting particles, the above statements become more transparent when referred to symbolic representations of the basic symmetric and antisymmetric states in terms of particles in ‘boxes’, where the boxes stand for single-particle states and the particles do not have distinguishing labels attached to them. Referring to the vector space $\mathcal{V}$, which all the factor spaces $\mathcal{V}^{(i)}$ ($i = 1, 2, \cdots, N$) are copies of, the eigenstates $|E_a\rangle$ ($a = 0, 1, 2, \cdots$) are the single-particle states of interest. A symmetric or antisymmetric $N$-particle state of the form $\sum_P \epsilon_P |E^{(P1)}_{a_1}\rangle \otimes |E^{(P2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(PN)}_{a_N}\rangle$ (recall that in the case of a symmetric state all the $\epsilon_P$ are taken to be $+1$, while for an antisymmetric state $\epsilon_P$ is the signature of the permutation $P$) is then symbolically represented by the boxes corresponding to the single-particle states (imagine all these boxes corresponding to $a = 0, 1, 2, \cdots$ to be placed side by side), of which some are occupied while the others are unoccupied, the occupied boxes being the ones with labels $a_1, \cdots, a_N$. The boxes do not have particle labels ‘1’, ‘2’, ‘3’, $\cdots$, ‘N’ attached to them because the particles are indistinguishable (which is why only the symmetric or antisymmetric states are marked out in the product space) and the single-particle states are the same for all of them.

Among the indices (i.e., the labels of the boxes) $a_1, \cdots, a_N$, there may possibly be one or more repeated ones. If an index (say, $a$) occurs $m$ times, then there will be $m$ particles in the box with label $a$ (i.e., in each term in the sum $\sum_P \epsilon_P |E^{(P1)}_{a_1}\rangle \otimes |E^{(P2)}_{a_2}\rangle \otimes \cdots \otimes |E^{(PN)}_{a_N}\rangle$, there will be exactly $m$ super-indices for which the lower index is $a$), and we will say that the box $a$ has an occupation number $m$. In this manner, the state of the composite system in question is represented symbolically by a number (say, $M$) of occupied boxes, each box with its own occupation number $m$.


For a system of fermions, however, there cannot be any repeated index among the set $a_1, \cdots, a_N$ since, for any repeated index, the sum over permutations, with the appropriate signatures taken into account, yields a zero vector (check this out; recall that the super-indices are notional labels for the particles, which are made irrelevant by means of the permutations). In other words, no single-particle state can be occupied by more than one particle in a system of identical fermions (Pauli’s exclusion principle). In the case of a system of identical bosons, on the other hand, a box representing a single-particle state can be occupied by any number of particles.


In other words, for a system of $N$ indistinguishable fermions, the number of occupied boxes is exactly $N$. Incidentally, it is important to note that the description of basic states in terms of occupied and unoccupied boxes holds for a system of interacting particles as well, and that the Pauli principle applies to fermions regardless of possible interactions among those. The box description is especially useful for a non-interacting system since it conveniently describes the energy eigenstates and eigenvalues.
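The vanishing of the antisymmetric combination for a repeated index can be seen explicitly by carrying out the sum over permutations for a small number of particles. In the sketch below a product basis vector is represented by the tuple of single-particle indices carried by the notional particles 1, 2, ..., N, and a state by a dictionary mapping such tuples to (unnormalized) coefficients; this representation and the function names are illustrative assumptions.

```python
from itertools import permutations

def permutation_sign(perm):
    """Signature of a permutation of 0..n-1: +1 if even, -1 if odd.
    Counts the transpositions needed to sort the permutation."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def combine(indices, antisymmetric=False):
    """Unnormalized sum over all permutations of the assignment of the
    single-particle indices to the notional particle labels.
    Returns a dict {product basis vector: coefficient}, zero entries removed."""
    state = {}
    for perm in permutations(range(len(indices))):
        coeff = permutation_sign(perm) if antisymmetric else 1
        key = tuple(indices[p] for p in perm)
        state[key] = state.get(key, 0) + coeff
    return {k: c for k, c in state.items() if c != 0}

if __name__ == "__main__":
    print(combine((0, 1), antisymmetric=True))      # two fermions in distinct states
    print(combine((0, 0, 1), antisymmetric=True))   # repeated index: nothing survives
    print(combine((0, 0, 1)))                       # three bosons, one box doubly occupied
```

For the repeated index every product vector is generated twice with opposite signs, so the antisymmetric sum is empty, while the symmetric sum survives with equal coefficients on all distinct arrangements.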


It is thus important to distinguish the single-particle states, represented by the individual boxes, from the system-states represented by sets of occupied boxes for the system made up of $N$ indistinguishable particles. In the grand canonical description of an ideal gas, the number of particles is not specified; for a system-state in which the number of particles is $N$ (a variable number), $M$ ($\le N$) of these boxes are occupied. Referring to the imagined infinite set of boxes, each corresponding to a single-particle state with some non-negative integer value of the index $a$, a specification of their respective occupation numbers ($m$, which can be any non-negative integer for bosons, but can only be 0 or 1 for fermions), completely specifies a system-state. In describing a system-state one often refers to only the occupied boxes (there remain, however, an infinite number of unoccupied ones for any finite value of $N$) and their respective occupation numbers.


For instance, fig. 3-1(A) below represents an energy eigenstate for a
system of three bosons with two occupied boxes (one with occupation number
two and another with occupation number one), the boxes corresponding to
the other single-particle states being vacant. Fig. 3-1(B) represents a
state of three fermions with three occupied single-particle states (each
with occupation number 1) with, again, the remaining single-particle
states vacant (occupation number 0). The small dots in (A) and (B)
represent the unoccupied boxes from among the infinite number of
single-particle states.


Figure 3-1: Illustrating the description of system-states for a system made up of
indistinguishable particles in terms of single-particle states represented by boxes; (A)
represents an energy eigenstate for a system of three bosons with two occupied boxes
(one with occupation number 2 and another with occupation number 1), the boxes
corresponding to the other single-particle states being vacant; (B) represents a state of a
system of three fermions with three occupied single-particle states (each with occupation
number 1; a larger occupation number is not possible for fermions) with, again, the
remaining single-particle states vacant (occupation number 0); the small dots in (A) and
(B) represent the unoccupied boxes from among the infinite number of single-particle
states.


As mentioned above, we will need only the energy eigenstates indicated
above (described by symmetric or antisymmetric product vectors and
represented by sets of boxes with their respective occupation numbers) in
order to build up the relevant ensembles for systems of non-interacting
identical particles (recall that these ensembles correspond to density
matrices diagonal in the energy representation), while superpositions of
these basic states are needed in the case of interacting systems.


The occupation number representation will be explained in greater detail
in section 7.2.


3.3.2 BE, FD, and MB statistics


A system of identical bosons for which the basic states are constructed
and represented as above is said to obey the Bose-Einstein (BE) statistics
while, similarly, a system of identical fermions is said to obey the
Fermi-Dirac (FD) statistics.


In the case of an interacting system, the basic principle distinguishing
between the BE and FD statistics remains the same, and what differs is the
working out of the basic energy eigenstates in terms of which the relevant
ensembles are constructed. For a non-interacting system one needs to
consider only the symmetric or antisymmetric (as the case may be)
combinations of the product eigenvectors constructed from single-particle
energy eigenstates while, in the case of an interacting system, one has to
obtain the relevant system eigenstates by an appropriate diagonalization
of the total Hamiltonian in the entire symmetric or antisymmetric subspace.


For a system of N identical non-interacting bosons or fermions, an
ensemble is constructed as follows: we consider all the single-particle
states represented by boxes (infinite in number), and then all possible
distributions of N particles (without labels attached to them) among the
boxes subject to the condition that, for a system of fermions, there
cannot be more than one particle in any of the boxes. Each such
distribution represents a basic (pure) state of the system, with some
weight associated with it, which need not be normalized for our present
considerations. Thermodynamic quantities are then obtained in terms of
weighted sums over these basic states where some physical quantity
depending on the state concerned (such as the energy of the system) is
summed over, subject to the respective weights of the states. For
instance, if $w_i$ is the weight of a state and $f_i$ the value of the
relevant physical quantity in the state, then one would like to evaluate
the sum $\sum_i w_i f_i$, where the index i refers to the basic system
states involved in the sum.


As mentioned above, one has to distinguish here between the states of the
many-particle system under consideration (ones we will, when necessary,
refer to as the ‘system states’) and the single-particle states in terms of
which the basic system states are described.


As a hypothetical but simple example explaining the above statements,
fig. 3-2(A) depicts the states of a system of two bosons, each
corresponding to a distribution of the two bosons in two single-particle
states. We find that there are three possible system states (marked ‘a’ to
‘c’ in the figure) for which the weights are, say, $w_1, w_2, w_3$
respectively. The required sum is then given by
$w_1 f_1 + w_2 f_2 + w_3 f_3$.


Fig. 3-2(B) depicts the situation for two fermions in two boxes
(single-particle states) where now one finds that only one system state is
allowed in accordance with the rule that more than one identical fermion
cannot be accommodated in the same single-particle state, which rules out
the distributions marked ‘b’ and ‘c’ in fig. 3-2(A). The required sum is
then, simply, $w_1 f_1$, corresponding to the first term in the expression
for the BE system. The two cases (BE and FD) differ precisely in those
distributions where there is more than one particle in at least one box in
the BE case.

At times, an approximate value of the required sum can be obtained by
following a different route, corresponding to what is referred to as the
‘Maxwell-Boltzmann’ (MB) statistics. Here one starts by attaching labels
(‘1’, ‘2’, ..., ‘N’) to the particles and distributing these numbered
particles in all possible ways among the boxes representing the
single-particle states. In the case of two particles and two boxes, this
produces four distributions shown in fig. 3-2(C), marked ‘a’ to ‘d’, where
there is one particle in each box in distributions ‘a’ and ‘b’ while, in
each of the distributions ‘c’ and ‘d’, there are two particles in one of
the boxes and none in the other box. Note the numbers (‘1’, ‘2’) now
attached to the particles (represented by circles) in these distributions:
if the numbers were wiped off, then the first two distributions would each
be identical with the first distribution in the BE case (and also with the
only allowed distribution in the FD case), while the other two (marked
‘c’, ‘d’) would reduce to the remaining two distributions in the BE case
(respectively, distributions marked ‘b’, ‘c’ in fig. 3-2(A)).


Put differently, fig. 3-2(C) involves too many distributions compared to
either the BE or the FD distributions, and one attempts to reproduce the
quantum distributions by introducing a correction factor as follows. One
writes down the relevant sum by referring to the uncorrected distributions
which, in the present instance of fig. 3-2(C), gives the expression
$2w_1 f_1 + w_2 f_2 + w_3 f_3$, where the factor of two in the first term
arises from the two distributions marked ‘a’, ‘b’ in the figure. One then
corrects by an overall factor of $\frac{1}{2}$ ($\frac{1}{N!}$ in the
general case of a system of N particles) by way of identifying
distributions obtained by permutations among the labeled particles,
thereby arriving at the expression
$w_1 f_1 + \frac{1}{2} w_2 f_2 + \frac{1}{2} w_3 f_3$. This, however, does
not have the same effect as that of removing the labels from the very
beginning since now the second and third terms in this expression differ
from the BE case (where both appear with coefficient 1 rather than
$\frac{1}{2}$) as also from the FD case (where these terms are absent,
which can be interpreted as terms with coefficients zero). In other words,
the expression for the weighted sum under consideration arrived at by
following this procedure over-corrects when compared to the BE case and
under-corrects when compared to the FD case.


Figure 3-2: Comparing the methods of the working out of weighted state sums (of
the form $\sum_i w_i f_i$) in (A) BE, (B) FD and (C) MB statistics; possible distributions of two
particles (represented by dots in (A) and (B), and by numbered circles in (C)) in two
single-particle states are shown; in (A), one has three possible distributions marked
‘a’, ‘b’, and ‘c’, while in (B), only one distribution is possible due to Pauli’s exclusion
principle; in (C) four distributions are possible, marked ‘a’, ‘b’, ‘c’, and ‘d’; the particle
labels (‘1’ and ‘2’ in the labeled circles) are actually not relevant in the state sums, which
means that the distributions marked ‘a’ and ‘b’ both give the same contribution as the
distribution ‘a’ in (A), while ‘c’ and ‘d’ give the same contributions as, respectively, ‘b’
and ‘c’ in (A); in addition, a factor of $\frac{1}{2}$ is to be brought in ($\frac{1}{N!}$ for N particles), which
makes the state sum in (C) different from that in (A) or (B); the three statistics differ
precisely for those distributions in which at least one of the two boxes accommodates
more than one particle, the other being unoccupied.
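
The counting in the two-particle, two-box example can be reproduced by
direct enumeration. The Python sketch below is an illustration only; the
function names (`occupation_numbers`, `be_states`, `fd_states`,
`mb_coefficients`) are hypothetical, and each distribution is represented by
the tuple of occupation numbers of the boxes.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product
from math import factorial


def occupation_numbers(boxes, n_boxes):
    """Convert a list of box indices (one per particle) to occupation numbers."""
    counts = Counter(boxes)
    return tuple(counts[b] for b in range(n_boxes))


def be_states(n_particles, n_boxes):
    """Unlabeled particles, any occupation number allowed (BE)."""
    return [occupation_numbers(c, n_boxes)
            for c in combinations_with_replacement(range(n_boxes), n_particles)]


def fd_states(n_particles, n_boxes):
    """Unlabeled particles, at most one particle per box (FD)."""
    return [occupation_numbers(c, n_boxes)
            for c in combinations(range(n_boxes), n_particles)]


def mb_coefficients(n_particles, n_boxes):
    """Labeled particles placed in all possible ways, then divided by N! (MB).

    Returns a dict: occupation numbers -> coefficient of the corresponding
    term in the weighted state sum.
    """
    labeled = Counter(occupation_numbers(p, n_boxes)
                      for p in product(range(n_boxes), repeat=n_particles))
    return {occ: Fraction(count, factorial(n_particles))
            for occ, count in labeled.items()}


print("BE:", be_states(2, 2))
print("FD:", fd_states(2, 2))
print("MB:", mb_coefficients(2, 2))
```

In the MB result the distribution with one particle in each box carries
coefficient 1, in agreement with both quantum cases, while the doubly
occupied boxes carry coefficient 1/2, lying between the BE value (1) and the
FD value (0).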

In other words, the MB statistics for a system of N identical particles
corresponds to the approach in which all possible distributions are
considered with the particles assigned distinguishing labels and then
correcting the resulting expression (for a sum over system states with
appropriate weights) by introducing an over-all factor of $\frac{1}{N!}$.
Looking back at sections 1.3.2.2 and 1.3.2.4, where the classical
description of states for a system made up of N identical particles in the
phase space was outlined, we recall that a similar division by N! was
introduced in the phase space integrals there so as to account for the
indistinguishability of the particles and, moreover, there was a factor of
$\frac{1}{h^{3N}}$ introduced so as to account for the fact that, owing to
the finiteness of the Planck constant, pure states of the system could not
be assumed to be continuously distributed.


These two factors were introduced in the classical context in an ad hoc
manner, independently of each other. The ‘MB statistics’ introduced in
the present section is essentially a re-statement of the classical approach
in that the same factor $\frac{1}{N!}$ is used to account for the
indistinguishability of the particles, and the discreteness in the
distribution of pure states in the phase space partitioned into cells is
accounted for by using discrete sums over relevant quantum mechanical
states in the place of integrals; in the limit of large volume (V) of the
system, the discrete sums reduce to integrals with the factor of
$\frac{1}{h^{3N}}$ making its appearance. The MB statistics is, therefore,
essentially the same thing as the classical approach where the mixed
states are represented in terms of phase space integrals, with the factor
of $\frac{1}{N!\,h^{3N}}$ attached.


In contrast to the MB statistics, the BE and the FD statistics are based
on genuine quantum mechanical principles, providing us with correct
descriptions of the behavior of systems of bosons and fermions
respectively. Considerations in the above paragraphs tell us that, in
calculating the relevant weighted sums over system states, the difference
between the three statistics arises from those terms in the sums that
correspond to system states in which the occupation numbers of one or more
of the single-particle states are greater than 1 (i.e., M < N, where M
stands for the number of occupied single-particle states; refer to
sec. 3.3.1).


This provides us with a criterion under which the MB statistics works out
to be a good approximation to the correct quantum statistics (BE in the
case of bosons and FD in the case of fermions): if the mean occupation
number of single-particle states in the basic system-states involved in an
ensemble is small compared to unity (refer to formula (3-49) below), then
one can conclude that the pure states in the ensemble for which there is
more than one particle in one or more single-particle states are
relatively few in number; in other words, the BE and the FD statistics are
expected to go over to the simpler MB statistics for small values of the
mean occupation number ($\bar{m} \ll 1$). This happens in the limit of the
Planck constant going to zero, but the Planck constant is a dimensional
constant of a naturally determined value, and it cannot be simply wished
to go to zero. As we will see, there exist a number of dimensionless
constants involving combinations of the Planck constant with other
dimensional constants such that, for small values of these dimensionless
constants, results of quantum statistical mechanics go over to the
classical results arrived at on the basis of MB statistics.

Under conditions of small values of the mean occupation number, the
relevant sums over the system states correspond to the phase space
integrals of classical statistical mechanics (the sums go over to
integrals for large values of the system volume, when the factor of
$\frac{1}{h^{3N}}$ makes its appearance). Indeed, there exists a
formulation of quantum theory in terms of the phase space, referred to as
the Wigner formulation (more precisely, several such formulations are
possible), which reduces to the phase space formulation of classical
mechanics when the Planck constant goes to zero in comparison with other
relevant quantities characterizing the dynamics of the system under
consideration.


3.3.3 The ideal gas in FD and BE statistics 
3.3.3.1 The description in terms of single-particle states 


In the quantum theory of the ideal gas, a major step is to reduce the
relevant sums over system states (i.e., the states of the gas as a whole)
to ones over single-particle states. The result of such a reduction
depends on the chemical potential of the gas which, in turn, is a function
of the temperature (T) and the density (i.e., the ratio $N/V$ in the
thermodynamic limit), though the explicit determination of this functional
dependence on T and $N/V$ can be accomplished only in approximate terms in
specific situations.


The reduction from a summation over system states to one over
single-particle states is achieved most conveniently for the grand
canonical ensemble where the system under consideration is described in
terms of $V, T, \mu$ as the basic variables.


The restriction to a fixed value of N for the microcanonical and the
canonical ensembles makes it inconvenient to make use of these in arriving
at meaningful results; results for the ideal gas in the microcanonical
ensemble will be outlined in sec. 3.3.5.


We start from the expression for the grand canonical partition function
that we write once again for ready reference:

$$Z_g = \sum_{N=0}^{\infty} \sum_i \exp[-\beta(E_i(N) - \mu N)], \tag{3-40}$$


where $E_i(N)$ ($i = 0, 1, 2, \cdots$; $N = 0, 1, 2, \cdots$) stands for
the energy eigenvalue of the gas as a whole with a specified number of
particles (N), corresponding to a system-state labeled with the index i;
we call these the N-particle system energies, distinguishing these from
the single-particle energies $\epsilon_a$ ($a = 0, 1, 2, \cdots$), where
it is to be noted that the index a refers to single-particle states. Since
the single-particle energy levels may be degenerate, the energies
$\epsilon_a$ may be identical for more than one distinct value of the
state index a.


The energy scale is chosen in such a way that all the single-particle
energies are non-negative, which implies that all system-energies are
non-negative as well. The values a = 0 and i = 0 are assigned to the
ground states for the single particle and the system respectively. In the
expression (3-40), the sum over the particle number N is assumed to run
from 0 to $\infty$.


This is equivalent to setting the scale of energy such that the vacuum
corresponds to zero value of energy. In subsequent paragraphs we will also
assume that the single-particle ground state is of energy zero, which
seems to be a contradiction. However, the assumption $\epsilon_0 = 0$ only
means that all the other energies are to be considered relative to the
single-particle ground state energy. For instance, the occurrence of the
chemical potential of the system ($\mu$) in an expression is to be
interpreted as $\mu - \epsilon_0$, while $\epsilon_a$
($a = 1, 2, \cdots$) is to be interpreted as $\epsilon_a - \epsilon_0$.
Considered all by itself, $\epsilon_0$ cannot be arbitrarily assumed to be
zero.


An ideal gas being a system of non-interacting particles, any chosen
system state $|E_i(N)\rangle$ with energy $E_i(N)$ can be described, as
explained in sec. 3.3.2, by means of a set of occupation numbers $m_a$ for
all the single-particle states $|\epsilon_a\rangle$
($a = 0, 1, 2, \cdots$) where, in the case of fermions, $m_a$ can be
either 0 or 1 while, for bosons, $m_a$ can have any non-negative integer
value. The sequence $\{m_a\}$ then gives the particle number (N) and the
energy ($E_i(N)$) through the relations

$$N = \sum_a m_a, \qquad E_i(N) = \sum_a m_a \epsilon_a. \tag{3-41}$$


The index i labeling the system-state is actually determined by the
sequence $\{m_a\}$.


Such a sequence of occupation numbers defined for all the single-particle
states will be referred to as a ‘distribution’ (recall our use of the term
in this sense in sec. 3.3.2). The sum over system-states for all possible
particle numbers in (3-40) can then be worked out as a sum over all
possible distributions which, in turn, can be expressed as

$$Z_g = \prod_a \sum_{m_a} \exp[-\beta m_a(\epsilon_a - \mu)], \tag{3-42}$$

where the product over a involves all the single-particle states and, for
any given a, the sum over $m_a$ ranges from 0 to 1 for fermions, and from
0 to $\infty$ for bosons. This way of writing the grand canonical
partition function incorporates the quantum mechanical principle that
system-states for fermions have to be antisymmetric and those for bosons
have to be symmetric for a system of indistinguishable particles (refer to
sec. 3.3.1; it is straightforward to see that the expression (3-42)
reproduces the sum over all possible distributions resulting from the sum
in (3-40)).
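
The equality between the sum over distributions and the product form
(3-42) can be checked numerically for a small model. The sketch below is an
illustration only: the three single-particle energies, the values of
$\beta$ and $\mu$, and the cap `max_occupation` on the boson occupation
numbers are all assumptions made here to keep the sums finite (the
factorization holds for the capped sums as well), and the function names are
hypothetical.

```python
from itertools import product
from math import exp, isclose


def grand_sum_over_distributions(energies, beta, mu, max_occupation):
    """Direct sum of exp[-beta (E - mu N)] over all distributions {m_a}."""
    total = 0.0
    for occupations in product(range(max_occupation + 1), repeat=len(energies)):
        n_particles = sum(occupations)                     # N   = sum_a m_a
        energy = sum(m * eps for m, eps in zip(occupations, energies))  # E = sum_a m_a eps_a
        total += exp(-beta * (energy - mu * n_particles))
    return total


def grand_sum_as_product(energies, beta, mu, max_occupation):
    """Product over single-particle states of the sums over m_a."""
    result = 1.0
    for eps in energies:
        result *= sum(exp(-beta * m * (eps - mu))
                      for m in range(max_occupation + 1))
    return result


energies = [0.0, 0.5, 1.3]   # hypothetical single-particle energies
beta, mu = 1.0, -0.2         # hypothetical inverse temperature and chemical potential

for label, cap in (("fermions (m_a = 0, 1)", 1), ("bosons (m_a capped at 6)", 6)):
    direct = grand_sum_over_distributions(energies, beta, mu, cap)
    factored = grand_sum_as_product(energies, beta, mu, cap)
    print(label, direct, factored, isclose(direct, factored, rel_tol=1e-12))
```

The factorization works because the exponent is additive over the
single-particle states and the occupation numbers of different states are
summed independently, which is possible only when N is not held fixed.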


Later in this chapter, in sec. 3.3.5, the term ‘distribution’ will be used
in a different sense, namely, to denote a sequence of particle numbers in
the various single-particle energy levels.


For any given single-particle state corresponding to a specified value of
the index a, it will be necessary to refer to a sum over all the possible
occupation numbers $m_a$ of the form

$$z_a(\beta, \mu) = \sum_{m_a} \exp[-\beta m_a(\epsilon_a - \mu)], \tag{3-43}$$


which we refer to as the single-particle partition sum for the given
single-particle state with index a and for given values of the chemical
potential $\mu$ and the temperature $T$ ($= (k_B\beta)^{-1}$). Note,
however, that even as the expression in (3-43) refers to some particular
single-particle state, the temperature (T) and the chemical potential per
particle ($\mu$) correspond to an equilibrium state of the system under
consideration described by the grand canonical ensemble. According to
(3-42), the grand partition function $Z_g$ appears as the product of
single-particle partition sums over the single-particle states owing to
the fact that the system under consideration is made up of non-interacting
particles:

$$Z_g = \prod_a z_a. \tag{3-44}$$


One can now go back to the expression (2-39a) for the probability of a
system state with N particles and with system energy $E_i(N)$, which can
alternatively be interpreted as the probability of a distribution over all
single-particle states $|\epsilon_a\rangle$ with occupation numbers $m_a$
for some particular sequence $\{m_a\}$ determined by the system state
under consideration so that we have, under this interpretation,
$w_i \to p(\{m_a\})$. This expression can be written in the form

$$p(\{m_a\}) = \frac{\prod_a \exp[-\beta m_a(\epsilon_a - \mu)]}{\prod_a z_a(\beta, \mu)} = \prod_a \frac{\exp[-\beta m_a(\epsilon_a - \mu)]}{z_a}. \tag{3-45}$$


This is seen to be the product of factors $p(m_a)$,

$$p(\{m_a\}) = \prod_a p(m_a), \tag{3-46a}$$

$$p(m_a) = \frac{\exp[-\beta m_a(\epsilon_a - \mu)]}{z_a}, \tag{3-46b}$$

where $p(m_a)$ can be interpreted as the probability that the
single-particle state corresponding to index a has an occupation number
$m_a$. The fact that the probability of a system-state, corresponding to a
distribution $\{m_a\}$, can be expressed as the product of independent
probabilities $p(m_a)$ for the individual single-particle states is a
consequence of the fact that the gas is made up of non-interacting
identical particles and of our choice of the grand canonical ensemble in
describing the system behavior, where the particle number N is not fixed
but is summed over all possible values.


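The mean occupation number of a single-particle state
$|\epsilon_a\rangle$ is given by

$$\bar{m}_a = \sum_{m_a} m_a\, p(m_a) = \frac{1}{\beta} \frac{\partial}{\partial \mu} \ln z_a, \tag{3-47}$$

where $z_a = z_a(\beta, \mu)$ is given by (3-43) (check this statement
out).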


The expression for $z_a$ can be worked out in the FD and BE cases by
making use of the rule of occupation of the single-particle states:
$m_a = 0, 1$ for fermions, and $m_a = 0, 1, 2, \cdots$ for bosons. One
obtains

$$\text{[FD statistics]}\quad z_a = \sum_{m_a=0}^{1} \exp[-\beta m_a(\epsilon_a - \mu)] = 1 + e^{-\beta(\epsilon_a - \mu)} \tag{3-48a}$$

and

$$\text{[BE statistics]}\quad z_a = \sum_{m_a=0}^{\infty} \exp[-\beta m_a(\epsilon_a - \mu)] = \frac{1}{1 - e^{-\beta(\epsilon_a - \mu)}}. \tag{3-48b}$$

Substituting in (3-47), one obtains the formula for the mean occupation
number of any given single-particle state:

$$\text{[FD and BE statistics]}\quad \bar{m}_a = \frac{1}{e^{\beta(\epsilon_a - \mu)} \pm 1}, \tag{3-49}$$

where the upper sign in the denominator on the right hand side is for a
FD gas while the lower sign is for a BE gas.
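
As a check on (3-47) to (3-49), the mean occupation number can be computed
directly as a weighted average over $m_a$ and compared with the closed
form. The following Python sketch is an illustration only; it works with the
single dimensionless variable $x = \beta(\epsilon_a - \mu)$, the boson sum is
truncated at a large but finite occupation number (an assumption, harmless
for $x > 0$), and the function names are hypothetical.

```python
from math import exp, isclose


def mean_occupation_direct(x, max_occupation):
    """Average of m_a with weights exp(-m_a x), x = beta (eps_a - mu)."""
    weights = [exp(-m * x) for m in range(max_occupation + 1)]
    return sum(m * w for m, w in enumerate(weights)) / sum(weights)


def mean_occupation_closed(x, fermions):
    """Closed form 1 / (exp(x) + 1) for FD and 1 / (exp(x) - 1) for BE."""
    return 1.0 / (exp(x) + (1.0 if fermions else -1.0))


x = 0.7  # hypothetical value of beta (eps_a - mu); must be positive for bosons

fd_direct = mean_occupation_direct(x, max_occupation=1)
be_direct = mean_occupation_direct(x, max_occupation=200)  # truncated boson sum

print("FD:", fd_direct, mean_occupation_closed(x, fermions=True))
print("BE:", be_direct, mean_occupation_closed(x, fermions=False))
```

For bosons the geometric series converges only if
$\epsilon_a - \mu > 0$ for every single-particle state, which is why x is
taken to be positive in the example.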


The thermodynamic functions such as the entropy, pressure, and the
mean particle number in a state characterized by the parameters
$V, \beta, \mu$ are all obtained from the grand potential
$\Omega = -\beta^{-1} \ln Z_g$ which, because of (3-44), can be expressed
in terms of the single-particle partition sums, with reference to which
the quantity of interest is

$$\text{[FD and BE statistics]}\quad \ln z_a = \pm \ln(1 \pm e^{-\beta(\epsilon_a - \mu)}), \tag{3-50}$$

where, once again, the upper signs in the right hand side are for the FD
gas while the lower signs refer to the BE gas. The resulting expression
for the grand potential is

$$\text{[FD and BE statistics]}\quad \Omega = \mp \beta^{-1} \sum_a \ln(1 \pm e^{-\beta(\epsilon_a - \mu)}). \tag{3-51}$$


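The expressions for the mean particle number, mean energy (identified
with the internal energy), pressure, and entropy work out to

$$\bar{N} = \sum_a \bar{m}_a = \sum_a \frac{1}{e^{\beta(\epsilon_a - \mu)} \pm 1}, \tag{3-52a}$$

$$U = \sum_a \epsilon_a \bar{m}_a = \sum_a \frac{\epsilon_a}{e^{\beta(\epsilon_a - \mu)} \pm 1}, \tag{3-52b}$$

$$p = -\frac{\partial \Omega}{\partial V} = \pm \beta^{-1} \frac{\partial}{\partial V} \sum_a \ln(1 \pm e^{-\beta(\epsilon_a - \mu)}), \tag{3-52c}$$

$$S = -\frac{\partial \Omega}{\partial T} = k_B \sum_a \left[ \pm \ln(1 \pm e^{-\beta(\epsilon_a - \mu)}) + \frac{\beta(\epsilon_a - \mu)}{e^{\beta(\epsilon_a - \mu)} \pm 1} \right]. \tag{3-52d}$$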


As mentioned earlier, the above expressions for $\bar{N}$, U, p, and S are
valid, in a formal sense, even for a finite system. Though, in the case of
a finite system, $\bar{N}$, U, and S are not extensive variables (i.e.,
ones proportional to V), yet these satisfy the thermodynamic relation
$\Omega = U - TS - \mu\bar{N}$. However, the formula
$U = TS - pV + \mu\bar{N}$ is not satisfied by these formally defined
quantities, as a result of which the relation $\Omega = -pV$ between the
grand potential and the pressure is satisfied only in the thermodynamic
limit.
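
That the relation $\Omega = U - TS - \mu\bar{N}$ holds term by term for a
finite set of single-particle states can be verified numerically from
(3-51), (3-52a), (3-52b), and (3-52d). The Python sketch below is an
illustration only: it uses units with $k_B = 1$, a hypothetical list of four
single-particle energies, and hypothetical values of $\beta$ and $\mu$
(with $\mu$ below the lowest energy, as required for bosons).

```python
from math import exp, isclose, log


def thermodynamics(energies, beta, mu, fermions):
    """Return (Omega, N_mean, U, S) for an ideal FD or BE gas, with k_B = 1."""
    s = 1.0 if fermions else -1.0        # upper sign: FD, lower sign: BE
    omega = n_mean = u = entropy = 0.0
    for eps in energies:
        x = beta * (eps - mu)
        occupation = 1.0 / (exp(x) + s)          # mean occupation number
        ln_z = s * log(1.0 + s * exp(-x))        # ln z_a
        omega -= ln_z / beta                     # grand potential
        n_mean += occupation                     # mean particle number
        u += eps * occupation                    # internal energy
        entropy += ln_z + x * occupation         # entropy (in units of k_B)
    return omega, n_mean, u, entropy


energies = [0.0, 0.4, 1.1, 2.5]   # hypothetical single-particle energies
beta, mu = 1.5, -0.3              # hypothetical values; T = 1 / beta

for fermions in (True, False):
    omega, n_mean, u, entropy = thermodynamics(energies, beta, mu, fermions)
    rhs = u - entropy / beta - mu * n_mean
    print("FD" if fermions else "BE", omega, rhs,
          isclose(omega, rhs, rel_tol=1e-9))
```

The check involves no volume derivative, which is consistent with the
remark that this relation, unlike $\Omega = -pV$, does not require the
thermodynamic limit.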


In this context, recall that one can, in an analogous manner, formally
define the quantities U, p for a family of states (with variable values of
V, T) of a finite system described by means of the canonical ensemble, for
which the number of particles N is held at a fixed integer value. Thus,
strictly speaking, $\mu$ is not defined in the canonical formalism. One
can effectively assume $\mu = 0$ for a system with a fixed particle
number. In the canonical formalism, then, $\mu$ is defined only in the
thermodynamic limit, when the relation $U = TS - pV + \mu N$ is recovered.


Incidentally, the symbol ‘p’, when used to denote the pressure, should not
be confused with the same symbol being employed to denote the occupancy
probability $p(m_a)$ or the probability $p(\{m_a\})$ associated with a
distribution $\{m_a\}$.


3.3.3.2 The thermodynamic limit: continuously distributed single-particle states


In the thermodynamic limit ($V \to \infty$), the distribution of the
single-particle states becomes effectively continuous, in which case a
summation over the single-particle states can be replaced with an
integration. Since, in the thermodynamic limit, one can ignore the effect
of the shape of the surface in determining the system behavior, we assume,
for the sake of simplicity, that the gas is enclosed within a region of
cubical shape, in which case the index a identifying a single-particle
state stands for a set of three integer quantum numbers, say,
$\{n_1, n_2, n_3\}$ (in addition to the quantum number $m_s$ described
below), and the corresponding single-particle energy is given by

$$\epsilon_{n_1 n_2 n_3} = \frac{h^2}{8mV^{2/3}} (n_1^2 + n_2^2 + n_3^2) \qquad (n_1, n_2, n_3 = 1, 2, \cdots), \tag{3-53}$$

where m in the prefactor stands for the mass of a particle.


Recall the quantum mechanics of a particle enclosed in a cubical box of
volume V, with the boundary condition that the wave function is to vanish
on the boundary surface of the box.


In the limit $V \to \infty$, one can replace
$\frac{h}{2V^{1/3}}(n_1, n_2, n_3)$ with $(p_1, p_2, p_3)$ where
$p_1, p_2, p_3$ can be treated effectively as continuous variables that
now characterize the single-particle states. The single-particle energy
$\epsilon_a$ is now replaced with

$$\epsilon_a \to \epsilon_{p_1 p_2 p_3} = \frac{1}{2m}(p_1^2 + p_2^2 + p_3^2), \tag{3-54}$$


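where each of the variables $p_1, p_2, p_3$ can take up values ranging
from 0 to $\infty$. The sum over the index a, of some function, say,
$\phi(\epsilon_a)$, then goes over to the integral

$$\sum_a \phi(\epsilon_a) \to \frac{8Vg}{h^3} \int_0^\infty dp_1 \int_0^\infty dp_2 \int_0^\infty dp_3\, \phi\!\left(\frac{p_1^2 + p_2^2 + p_3^2}{2m}\right), \tag{3-55}$$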


(check this out; recall the definition of an integral as the limit of a
sum), where the factor g is explained as follows. The quantum mechanical
description of the state of a particle of the gas has to include its spin
state, in addition to the orbital quantum numbers $n_1, n_2, n_3$
appearing in (3-53). The spin characterizes an internal state of the
particle, determined by a quantum number $m_s$ that can take up values
ranging from $-s$ to $s$, with successive values differing by unity, where
s denotes the spin quantum number that has a fixed value (either a
half-integer or an integer, corresponding to a fermion or a boson) for
each particle. Though the state of a free particle depends on the quantum
number $m_s$, its energy does not, thereby causing the energy to be the
same for $2s + 1$ states differing in their internal quantum number $m_s$.
In other words, the energy given by (3-53) or by (3-54) is g-fold
degenerate where the degeneracy number g equals $2s + 1$. A sum over
states given by the first expression in (3-55) is therefore g times a sum
over energies, where the latter are labeled with $p_1, p_2, p_3$ as in
(3-54).




With the integrand in (3-55) depending only on the squares of the
variables $p_1, p_2, p_3$, the above can be further reduced to

$$\sum_a \phi(\epsilon_a) \to \frac{Vg}{h^3} \int_{-\infty}^{\infty} dp_1 \int_{-\infty}^{\infty} dp_2 \int_{-\infty}^{\infty} dp_3\, \phi\!\left(\frac{p_1^2 + p_2^2 + p_3^2}{2m}\right) = \frac{4\pi Vg}{h^3} \int_0^\infty \phi\!\left(\frac{p^2}{2m}\right) p^2\, dp, \tag{3-56}$$

where the last expression is obtained by introducing polar co-ordinates
in the $p_1$-$p_2$-$p_3$ space, and defining the variable p as
$p = \sqrt{p_1^2 + p_2^2 + p_3^2}$ (this variable, p, is not to be
confused with the pressure of the gas, denoted by the same symbol as per
convention). In the following, we transform back from the variable p to
the energy $\epsilon$ (now a continuous variable corresponding to the
single-particle energy) according to $\epsilon = \frac{p^2}{2m}$.
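
The replacement of the sum over box states by an integral can be tested
numerically for the particular choice
$\phi(\epsilon) = e^{-\beta\epsilon}$ with g = 1. In terms of the
dimensionless parameter $a = \beta h^2/(8mV^{2/3})$ (a symbol introduced
here only for this illustration), the sum over $n_1, n_2, n_3$ factorizes
into the cube of a one-dimensional sum, while the integral in (3-56)
evaluates to $(\pi/4a)^{3/2}$. The Python sketch below, with hypothetical
function names, compares the two as a decreases, i.e., as V grows at fixed
temperature.

```python
from math import ceil, exp, pi, sqrt


def box_sum(a):
    """Sum of exp[-a (n1^2 + n2^2 + n3^2)] over n1, n2, n3 = 1, 2, ..."""
    n_max = ceil(10.0 / sqrt(a))   # terms beyond this are below exp(-100)
    one_dimensional = sum(exp(-a * n * n) for n in range(1, n_max + 1))
    return one_dimensional ** 3


def integral_estimate(a):
    """Value of the continuum replacement for the same function."""
    return (pi / (4.0 * a)) ** 1.5


def ratio(a):
    """Discrete sum divided by its continuum estimate."""
    return box_sum(a) / integral_estimate(a)


for a in (1e-2, 1e-3, 1e-4):
    print(a, ratio(a))
```

The ratio approaches unity as a tends to zero; the residual deficit comes
from the states near the co-ordinate planes in n-space, a surface effect
that becomes negligible in the thermodynamic limit.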


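With all these replacements and definitions, one obtains the following
expressions in the thermodynamic limit ($V \to \infty$; $\mu$, T fixed)
for the grand potential, the mean particle number, the internal energy,
the entropy, and the pressure for the ideal FD and BE gases at temperature
$T = (k_B\beta)^{-1}$ and with chemical potential $\mu$:

(FD and BE ideal gas, thermodynamic limit:)

$$\Omega = \mp 2\pi g k_B T V \left(\frac{2m}{h^2}\right)^{3/2} \int_0^\infty \epsilon^{1/2} \ln(1 \pm \gamma e^{-\beta\epsilon})\, d\epsilon,$$

$$\bar{N} = 2\pi g V \left(\frac{2m}{h^2}\right)^{3/2} \int_0^\infty \frac{\epsilon^{1/2}\, d\epsilon}{\gamma^{-1} e^{\beta\epsilon} \pm 1},$$

$$U = 2\pi g V \left(\frac{2m}{h^2}\right)^{3/2} \int_0^\infty \frac{\epsilon^{3/2}\, d\epsilon}{\gamma^{-1} e^{\beta\epsilon} \pm 1},$$

$$S = 2\pi g k_B V \left(\frac{2m}{h^2}\right)^{3/2} \int_0^\infty \left[ \pm \ln(1 \pm \gamma e^{-\beta\epsilon}) + \frac{\beta(\epsilon - \mu)}{\gamma^{-1} e^{\beta\epsilon} \pm 1} \right] \epsilon^{1/2}\, d\epsilon,$$

$$p = \pm 2\pi g k_B T \left(\frac{2m}{h^2}\right)^{3/2} \int_0^\infty \epsilon^{1/2} \ln(1 \pm \gamma e^{-\beta\epsilon})\, d\epsilon, \tag{3-57}$$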

where $\gamma = e^{\beta\mu}$ stands for the fugacity. In these
expressions the upper signs refer to the FD gas and the lower ones to the
BE gas. As mentioned above, the factor g stands for the degeneracy
parameter, having the value $(2s + 1)$ for a particle with spin quantum
number s. In the case of electrons, for instance, one has
$s = \frac{1}{2}$, and g = 2. One observes that $\Omega$, $\bar{N}$, U,
and S appear as extensive state functions (as these should, in the
thermodynamic limit), and the pressure p as an intensive one.


It is, however, important to qualify the above results by making the
observation that one needs to use modified formulae in the case of a
strongly degenerate BE gas below and close to the BE condensation
temperature (sec. 3.3.7.2). In the case of BE condensation, the ground
state is populated to an anomalous extent, which one has to take into
account in deriving the thermodynamic parameters of the system.


With this background, we will quickly go through a number of basic results
relating to the FD and BE ideal gases in secs. 3.3.6 and 3.3.7, after
outlining, in sec. 3.3.4 below, how the classical description of the ideal
gas is obtained in a limiting sense from the more general quantum
mechanical description. Section 3.3.5 will be in the nature of a
digression.


3.3.4 The ideal gas in the classical limit: MB statistics 


As outlined in sec. 3.3.2, one expects the results of FD and BE statistics
to reduce to those of MB statistics for an ideal gas when the mean
occupation number ($\bar{m}_a$) of any given single-particle state is
small. Looking at formula (3-49), this is ensured if the mean occupation
number of the ground state is small, i.e., $e^{-\beta(\mu - \epsilon_0)}$
is large compared to unity, since the mean occupation number decreases
with increasing energy of the single-particle state. Since in the limit
$V \to \infty$, $\epsilon_0$ itself goes to zero (check this out), we can,
for our present purpose, assume that
$e^{-\beta(\mu - \epsilon_0)} \to e^{-\beta\mu}$.


In other words, the condition that the quantum mechanical description
(in either the FD or the BE statistics) reduces to the classical one is
that the fugacity is to tend to zero ($\gamma = e^{\beta\mu} \ll 1$), in
which case quantum mechanical symmetry requirements on the system states
cease, to all intents and purposes, to be restrictive, and the MB approach
in calculating the relevant sums over the system states (which can be
described as sums over all possible distributions of particles among the
single-particle states) acquires validity. Under this limiting condition
the mean occupation number for the single-particle state with index a is
given by (refer to (3-49))

$$\bar{m}_a = e^{\beta\mu} e^{-\beta\epsilon_a}. \tag{3-58a}$$


Thus, the probability for the said single-particle state is given by

$$p(\alpha) = \frac{e^{-\beta\epsilon_\alpha}}{z}, \qquad z = \sum_\alpha e^{-\beta\epsilon_\alpha}, \tag{3-58b}$$

where $z$ is referred to as the single-particle partition sum (though its
connotation differs from $z_\alpha$, introduced in (3-43), also referred to
by the same name). This expresses the Maxwell-Boltzmann energy distribution
formula for the ideal gas, from which follows the Maxwell velocity
distribution formula (2-75).


As mentioned in sec. 2.2.2, the Maxwell-Boltzmann formula for a classical
ideal gas can be obtained by looking at a single molecule and treating it as a
system in weak interaction with the rest of the gas, the entire system being
in equilibrium at a temperature, say, T. The molecule in question may be
looked upon as a ‘small’ system in equilibrium at the temperature T, with
its energy fluctuating across various possible values due to the interaction
with the rest of the gas, the latter playing the role of a heat bath. If,
however, the classical approximation does not work for the ideal gas, then the
above line of reasoning fails and the Maxwell-Boltzmann energy distribution
formula (3-58b) is no longer valid.
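The way the FD and BE occupation numbers of formula (3-49) merge into the MB form (3-58a) for small fugacity can be checked numerically. The following is a minimal sketch, assuming the standard form $1/(e^{\beta(\epsilon_\alpha-\mu)} \pm 1)$ for (3-49); the function name `occupation` and the sample values of the fugacity are illustrative choices, not taken from the text.

```python
import math

def occupation(beta_eps, y, statistics):
    """Mean occupation number of a single-particle state.

    beta_eps   : beta * epsilon (dimensionless energy of the state)
    y          : fugacity exp(beta * mu)
    statistics : "FD", "BE" or "MB"
    """
    boltzmann = y * math.exp(-beta_eps)           # MB form, eq. (3-58a)
    if statistics == "MB":
        return boltzmann
    sign = 1.0 if statistics == "FD" else -1.0    # upper sign FD, lower sign BE
    # Algebraically identical to 1 / (exp(beta*(eps - mu)) +/- 1)
    return boltzmann / (1.0 + sign * boltzmann)

# Ground state (beta*eps = 0) for decreasing fugacity: the three values merge.
for y in (0.5, 0.1, 0.001):
    fd, be, mb = (occupation(0.0, y, s) for s in ("FD", "BE", "MB"))
    print(f"y = {y:<6} FD = {fd:.6f}  BE = {be:.6f}  MB = {mb:.6f}")
```

For any fixed energy the FD value lies below the MB value and the BE value above it, with the relative differences being of the order of the MB occupation number itself.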


As for the thermodynamic functions, the sums over the single-particle
states reduce, in the thermodynamic limit, to integrals, as explained in
sec. 3.3.3.2, and one expects to recover the classical results of sec. 3.2.
This can be checked right away by looking at the limiting form of the first
equality in (3-57) for $y \ll 1$,

$$\text{[ideal gas in MB limit:]}\quad \Omega = -g k_B T V y \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}, \tag{3-59a}$$

from which one gets the MB approximation to the grand partition function
as

$$\text{[ideal gas in MB limit:]}\quad Z_g = \exp\left[g V y \left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}\right]. \tag{3-59b}$$

This is consistent with (3-34a), (3-34b), based on classical considerations,
except for the factor $g$ (the spin degeneracy), which needs to be
commented upon at this point.


Though it has come up in the context of the spin states of the particles
making up the system (an ideal gas) under consideration, and is
essentially quantum mechanical in origin, it continues to remain in the
classical limit (as in the expression (3-59a) above), reminding us that a
number of features of quantum mechanical origin have to be necessarily
taken into consideration in the classical approach in statistical mechanics
for the explanation of thermodynamic behavior of systems. While
the spin of a particle refers to internal states in contrast to orbital states
(such as the ones characterized by the quantum numbers $n_1, n_2, n_3$; refer
to formula (3-53)), one needs to consider, in numerous contexts, internal
states of more general description. For instance, in the case of a gas
made up of neutral atoms, the internal states relate to electronic
configurations of the atoms along with other ‘modes’ of excitation such as
the rotational and vibrational modes in addition to possible nuclear
excitations (refer back to sec. 3.1.1, in the context of which the degeneracy
parameter $g$ appearing in (3-59b) is nothing but the factor
$z_{\rm rot} z_{\rm vib} z_{\rm e} z_{\rm nuc}$ introduced there).


For a wide range of physical conditions, the nuclear excitations can be
disregarded and, in a similar vein, one needs to consider only the
electronic ground states of the atoms, for which $g = 1$ since the angular
momenta of all the electrons in the atom add up to zero in the ground
state, which is a unique state with the minimum possible energy associated
with the internal configuration of the electrons. It is this value of the
degeneracy parameter ($g = 1$) that was implicitly assumed in the classical
phase space considerations of sec. 3.2.2, while the formula (3-59a)
explicitly accommodates the more general situation of degeneracy of
single-particle states relating to internal states of the particles under
consideration. In the case of a gas made of diatomic or polyatomic molecules,
the internal states involve rotational and vibrational motions of the atoms,
even when one disregards the electronic states other than the electronic
ground state and also the nuclear excitations. With reference to results
stated in sec. 3.1.1.1, it may be mentioned, however, that the nuclear
excitations may still be relevant under certain circumstances and that
the contribution of the rotational partition function $z_{\rm rot}$ to the
degeneracy factor $g$ may, under appropriate conditions, be obtained
classically.


Incidentally, the physical relevance of the condition $y \ll 1$ for the validity
of the MB statistics and of the classical limit is made clear by referring to
the second equality in formula (3-57) which approximates, in the classical
limit, to

$$y = \frac{\bar N}{gV}\left(\frac{h^2}{2\pi m k_B T}\right)^{3/2}, \tag{3-60a}$$

where the same formula (without the factor $g$) is also obtained directly in
the classical phase space approach (eq. (3-36a)). In other words, for a
given value of the degeneracy parameter $g$ the condition $y \ll 1$ translates
to

$$\lambda_T = \frac{h}{\sqrt{2\pi m k_B T}} \ll \left(\frac{V}{\bar N}\right)^{1/3}, \tag{3-60b}$$

where $\lambda_T$, the thermal wavelength at temperature T, stands for the de
Broglie wavelength (up to a factor of $\sqrt{\pi}$) of a particle with a kinetic
energy $k_B T$ (refer to sec. 3.2.2.1). It represents a quantum mechanical length
scale associated with the wave function of a free particle of energy $k_B T$,
the typical energy scale implied by the principle of equipartition for a
classical system at temperature T. The condition (3-60b) implies that, for the
classical approximation to work, this quantum mechanical length scale
is to be small compared to the average separation between particles in a
system confined within a volume V for which the mean number of particles
is $\bar N$. For such a sparse distribution of the particles making up the
system, there is little overlap between the wave functions of the particles
and thus, only a small occupancy probability of the single-particle states.


1. Incidentally, from the way the classical results have been arrived at in
the above paragraphs, it is apparent that the limiting transition from the
quantum formulation to the classical one makes sense only in the
thermodynamic limit. This puts a question mark on the relevance
of classical statistical mechanics for ‘small’ systems (refer to
sections 2.2.2 and 2.2.5), though the formal validity of thermodynamic
formulas in the quantum mechanical canonical and grand canonical
ensembles continues to hold (refer to sections 2.1.3, 2.1.5.2, and 2.1.6.2).

2. The temperature

$$T_D = \frac{h^2 n^{2/3}}{2\pi m k_B} \tag{3-61a}$$

is referred to as the degeneracy temperature, where $n = \frac{\bar N}{V}$ stands
for the number density of molecules in the system (at times, a numerical
factor in the definition of $T_D$ differs from that given above). The
condition (3-60b) can then be expressed equivalently in the form

$$T \gg T_D. \tag{3-61b}$$
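To get a feeling for the orders of magnitude involved in (3-60b) and (3-61a), one can evaluate the thermal wavelength and the degeneracy temperature for two familiar systems. The sketch below is only illustrative: the helium gas is assumed to be at 300 K and atmospheric pressure (with the number density estimated from the classical equation of state), and the conduction electron density of copper, 8.47e28 per cubic metre, is a commonly quoted value assumed here rather than taken from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_HE = 6.6464731e-27      # mass of a helium-4 atom, kg
M_E = 9.1093837015e-31    # electron mass, kg

def thermal_wavelength(mass, temperature):
    """lambda_T = h / sqrt(2 pi m k_B T), as in (3-60b)."""
    return H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)

def degeneracy_temperature(mass, number_density):
    """T_D = h^2 n^(2/3) / (2 pi m k_B), as in (3-61a)."""
    return H**2 * number_density ** (2.0 / 3.0) / (2.0 * math.pi * mass * K_B)

T = 300.0
n_helium = 101325.0 / (K_B * T)   # assumed: ideal gas at 1 atm and 300 K
n_copper = 8.47e28                # assumed conduction electron density, m^-3

for label, mass, n in (("helium gas", M_HE, n_helium),
                       ("electrons in copper", M_E, n_copper)):
    lam = thermal_wavelength(mass, T)
    print(f"{label}: n*lambda_T^3 = {n * lam**3:.3e}, "
          f"T_D = {degeneracy_temperature(mass, n):.3e} K")
```

Under these assumptions the helium gas comes out deep in the classical regime (degeneracy temperature well below 1 K), while the conduction electrons have a degeneracy temperature far above room temperature, so that the classical description fails for them.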


In summary, considerations relating to the ideal gas tell us that classical 
statistical mechanics does not constitute an independent approach but 
emerges as a limiting theory from the more complete quantum statistical 
mechanics, with its validity restricted to situations in which the thermal 
wavelength of the particles of a macroscopic system is small compared 
with the average separation between them. This limiting nature of the 
theory is in evidence in the presence of the Planck constant in its basic 


formulas, as seen in sec. 2.2. 


Conversely, results derived in classical statistical mechanics need to be 
modified when applied to situations beyond their limit of validity. For 
relatively small deviations from the condition of validity of the classical 
theory, the modifications are in the nature of small corrections while, 
in situations of a more general nature, the correct quantum mechanical 


results bear no relation to the classical theory. 


The ideal gas is a very special system in which the particles are
non-interacting. In this case the quantum mechanical corrections over the
classical results mentioned above depend only on the single parameter $y$,
the fugacity. For more general systems with interactions among the
particles, the quantum corrections involve additional parameters depending
on the Planck constant, the temperature, and on the range and strength
of the interactions (for a brief outline, refer to section 4.2). However,
such corrections are meaningful only for certain restricted classes of
situations. For still more general situations, the classical results cease to
be of relevance and the more complete theory of quantum statistical
mechanics is to be invoked in explaining and describing the thermodynamic
behavior of systems (refer, for instance, to chapter 7).


The limiting transition from quantum statistical mechanics to classical 
statistical mechanics based on the phase space of a system is made more 
transparent by referring to the Wigner function in quantum theory where 
the latter is formulated within a phase space framework. A brief intro- 


duction to the Wigner function will be found in section 10.2.4. 


3.3.5 Digression: the ideal gas in the microcanonical ensemble


In sections 3.3.3 and 3.3.4, we have outlined the statistical mechanics of
an ideal gas in the grand canonical ensemble since that is the setting
in which the partition function and the thermodynamic variables can
be worked out in the most direct manner. This is because the grand
canonical ensemble is free of constraints on the energy and the number
of particles in the gas.

We will now briefly outline the statistical mechanics of the ideal gas (a
system of non-interacting indistinguishable particles) in the microcanonical
ensemble, where one obtains results equivalent to the ones derived in
the sections mentioned above, in the thermodynamic limit. In the
microcanonical ensemble the energy (U) and the number of particles (N) have
specified values, as does the volume (V) of the gas. We will base our
considerations on quantum mechanical principles, where one distinguishes
between the BE and FD statistics, though we will address the limiting
case of the MB statistics as well, pertaining to the classical ideal gas.


We begin by once again considering the single-particle states
$|\epsilon_\alpha\rangle$. In the grand canonical context we focused on the
distribution of the particles in these single-particle states, where states
corresponding to different values of the index $\alpha$ may correspond to the
same single-particle energy $\epsilon_\alpha$, owing to degeneracy of the
energy levels. In the present microcanonical context the question we want to
address is: in how many ways can the N particles making up the gas be
distributed among the single-particle states subject to the value U of the
total energy? This will give us the number of microstates (W) for the given
values of U and N, in addition to the specified volume (V) of the gas, from
which the entropy S(U, V, N) can be obtained by making use of the Boltzmann
formula (2-2).


While this question poses a difficult problem in the case of BE and FD
statistics, with their symmetry requirements on states of the system of
particles as a whole, one can more conveniently arrive at an answer by
first looking at the number of ways the particles can be distributed among
the various single-particle energy levels, where an energy level,
$\epsilon_i$ ($i = 0, 1, 2, \cdots$), can have a degeneracy, say, $g_i$ (which
means that there are $g_i$ states $|\epsilon_\alpha\rangle$ such that
$\epsilon_{\alpha_1} = \epsilon_{\alpha_2} = \epsilon_i$
($\alpha_1, \alpha_2 = 1, 2, \cdots, g_i$)).


In a typical such distribution, let there be n particles in a g-fold
degenerate level $\epsilon$. We first try to figure out the number of ways
the n particles can be so distributed among the g single-particle states
belonging to the energy level subject to the BE or the FD
symmetry requirement as the case may be.


It turns out (refer, for instance, to [85], chapter 2) that the number asked
for is

$$\text{[BE statistics:]}\quad w(n,g) = \frac{(g+n-1)!}{n!\,(g-1)!}, \tag{3-62a}$$

$$\text{[FD statistics:]}\quad w(n,g) = \frac{g!}{n!\,(g-n)!}, \tag{3-62b}$$

where, in the FD case, the degeneracy g must evidently be $\ge n$, since at
most one fermion can occupy any specified single-particle state.
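The two counting formulas can be verified by brute force for small values of n and g. In the sketch below (the function names are illustrative), a bosonic microstate of the level is represented as a multiset of n state labels chosen from g, and a fermionic one as a set of n distinct labels; the closed forms (3-62a) and (3-62b) are then binomial coefficients.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def count_be(n, g):
    """Enumerate BE arrangements: any number of particles per state."""
    return sum(1 for _ in combinations_with_replacement(range(g), n))

def count_fd(n, g):
    """Enumerate FD arrangements: at most one particle per state."""
    return sum(1 for _ in combinations(range(g), n))

def w_be(n, g):
    """Closed form (3-62a): (g + n - 1)! / (n! (g - 1)!)."""
    return comb(g + n - 1, n)

def w_fd(n, g):
    """Closed form (3-62b): g! / (n! (g - n)!), zero when n > g."""
    return comb(g, n)

for n, g in ((2, 3), (3, 4), (4, 4)):
    print(n, g, count_be(n, g), w_be(n, g), count_fd(n, g), w_fd(n, g))
```

The enumeration becomes impractical very quickly as n and g grow, which is why the closed forms (and, later, Stirling's approximation) are needed.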


For a given sequence of degeneracies $\{g_i\}$ (depending on the Hamiltonian
of the system under consideration) and for a given sequence of occupation
numbers $\{n_i\}$ ($i = 0, 1, 2, \cdots$) of the energy levels, the number of
ways in which N particles can be distributed among all the different
single-particle states belonging to the various energy levels
($\epsilon_0, \epsilon_1, \cdots$), taking into account their respective
degeneracies, subject to a specified value (U) of their total energy, can be
expressed as a product of factors of the form

$$W(\{n_i\}) = \prod_i w(n_i, g_i), \tag{3-63a}$$

where

$$\sum_i n_i = N, \qquad \sum_i \epsilon_i n_i = U. \tag{3-63b}$$

Since there can be many distinct sequences $\{n_i\}$ satisfying (3-63b), there
can be equally many contributions of the form (3-63a) to the total number
of microstates (W) we are after:

$$W = \sum_{\{n_i\}} W(\{n_i\}). \tag{3-63c}$$


For a sufficiently large value of N ($\to \infty$), it turns out that there
exists one particular distribution (corresponding to a sequence, say,
$\{n_i^*\}$), whose contribution to W is overwhelmingly large compared to the
contribution of all the other distributions (i.e., sequences $\{n_i\}$
satisfying (3-63b)) taken together. Based on the fact that all microstates
are equally probable in the microcanonical ensemble, the ratio
$\frac{W(\{n_i\})}{W}$ for any given distribution $\{n_i\}$ can be interpreted
as the probability of that distribution in the ensemble. Thus, if we can
determine the most probable distribution $\{n_i^*\}$, then W can be
approximated as

$$W \approx W(\{n_i^*\}). \tag{3-64}$$

The most probable distribution can be determined by maximizing $W(\{n_i\})$
(eq. (3-63a)) with reference to all possible distributions. More precisely,
one maximizes $\ln W(\{n_i\})$, rather than $W(\{n_i\})$ itself, which suffices
to yield the thermodynamic parameters of the ideal gas. For sufficiently
large N, this can be done by making use of Stirling’s approximation in
the factorials occurring in the expression on the right hand side of (3-63a)
(making use of (3-62a) or (3-62b), as the case may be). The result of this
exercise turns out to be [85]

$$n_i^* = \frac{g_i}{e^{\gamma + \beta\epsilon_i} \pm 1}, \tag{3-65}$$

where the upper (resp. lower) sign is to be chosen for an ideal gas made
up of fermions (resp. bosons), and where $\gamma, \beta$ appear as undetermined
Lagrange multipliers in the maximization procedure mentioned above.


1. The term ‘distribution’ is used in the present context in a sense
different from that in sections 3.3.3 and 3.3.4, since the $n_i$’s are now the
numbers of particles belonging to the various energy levels.

2. For a gas enclosed in a sufficiently large volume, the single-particle
energy levels form a quasi-continuum, in which case it is more
appropriate to consider bunches of energy levels within narrow energy
windows ordered consecutively, instead of single energy levels ([67],
chapter 8). The degeneracy numbers $g_i$ then refer to the number of
independent single-particle states within these energy windows.

3. On the face of it, the use of Stirling’s approximation in the derivation
outlined above may appear questionable. In reality, however, the
degeneracy numbers $g_i$, referred to the energy windows mentioned above,
are enormously large, with the associated occupation numbers $n_i$ being
correspondingly large. In other words, the use of Stirling’s
approximation is justified in the thermodynamic limit.

4. Referring to the thermodynamic limit once again, the maximization of
$\ln W(\{n_i\})$ justifies (3-64) in the sense of the large deviation principle
(refer to sec. 5.5).
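The role of large degeneracy numbers in the result (3-65) can be illustrated for a single energy level. With the two constraints handled by Lagrange multipliers, the maximization decouples level by level: for each level one maximizes $\ln w(n,g) - (\gamma + \beta\epsilon)\,n$ over n. The sketch below does this by direct search over integer n, using exact log-factorials (via `math.lgamma`) instead of Stirling's approximation, and compares the outcome with (3-65). The names `ln_w`, `most_probable_n`, `predicted_n`, the combined variable x standing for $\gamma + \beta\epsilon$, and the search range chosen for bosons are all assumptions made for this illustration.

```python
import math

def ln_w(n, g, statistics):
    """Logarithm of w(n, g) from (3-62a) (BE) or (3-62b) (FD)."""
    if statistics == "FD":
        return math.lgamma(g + 1) - math.lgamma(n + 1) - math.lgamma(g - n + 1)
    return math.lgamma(g + n) - math.lgamma(n + 1) - math.lgamma(g)

def most_probable_n(g, x, statistics):
    """Integer n maximizing ln w(n, g) - x * n, where x = gamma + beta * eps."""
    # Fermions: n cannot exceed g. Bosons: 20*g is an assumed search bound,
    # adequate for the values of x (of order unity) used here.
    n_max = g if statistics == "FD" else 20 * g
    return max(range(n_max + 1),
               key=lambda n: ln_w(n, g, statistics) - x * n)

def predicted_n(g, x, statistics):
    """Most probable occupation according to (3-65)."""
    sign = 1.0 if statistics == "FD" else -1.0
    return g / (math.exp(x) + sign)

g, x = 5000, 1.0
for stat in ("FD", "BE"):
    print(stat, most_probable_n(g, x, stat), round(predicted_n(g, x, stat), 2))
```

For degeneracies of a few thousand the exact maximizer already agrees with (3-65) to within about one particle, in line with the remark that Stirling's approximation is justified when the $g_i$ and $n_i$ are large.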


The Lagrange multipliers $\gamma, \beta$ can be obtained back in terms of the
thermodynamic parameters worked out from the most probable distribution
(3-65). The latter are obtained from the Boltzmann formula, which
in the present context reads (refer to (3-64))

$$S = k_B \ln W(\{n_i^*\}). \tag{3-66}$$

The end result of the exercise identifies $\beta$ as $\frac{1}{k_B T}$, where
$T = \left(\frac{\partial U}{\partial S}\right)_{V,N}$ is the temperature of
the gas, and $\gamma$ as $-\beta\mu$, where $\mu$ stands for the chemical
potential per particle of the gas (refer to [85]).

Thus, we finally have,

$$n_i^* = \frac{g_i}{e^{\beta(\epsilon_i - \mu)} \pm 1}. \tag{3-67a}$$

Recalling that $n_i^*$ is the most probable number of particles in a
single-particle energy level corresponding to $g_i$ degenerate eigenstates
$|\epsilon_\alpha\rangle$, the most probable number of particles in a
single-particle state is obtained as

$$n_\alpha = \frac{1}{e^{\beta(\epsilon_\alpha - \mu)} \pm 1}, \tag{3-67b}$$

where, once again, the upper (resp., lower) sign corresponds to the FD
(resp., BE) case. This agrees with the result (3-49), allowing for a slight
difference in notation and interpretation. While (3-49) gives the mean
number of particles in the single-particle state $|\epsilon_\alpha\rangle$,
$n_\alpha$ in formula (3-67b) above gives the most probable number of
particles in the same state. The two become identical in the thermodynamic
limit $N \to \infty$, $V \to \infty$, $\frac{N}{V} \to$ finite. It then follows
that the thermodynamics of the ideal FD or BE gas, as obtained in the grand
canonical ensemble (sec. 3.3.3), is reproduced in the description based on
the microcanonical ensemble as well, since all thermodynamic variables can
be derived from the most probable or the mean number of particles in the
various possible single-particle states.


The number of microstates (W), all of which are equally probable, receives
the dominant contribution from the most probable distribution, based on
which we have obtained the ‘most probable’ value of the occupation number
$n_i^*$. For the other distributions, the occupation numbers differ from the
most probable value and it makes sense to look at the mean values of the
occupation numbers, as given by (3-49). The fact that the most probable
and the mean values agree indicates that the fluctuations in the occupation
numbers are vanishingly small in the thermodynamic limit. This, indeed, is
the criterion for the validity of the derivation outlined above.


It now remains to check that the MB limit indicated in sec. 3.3.4 is
reproduced for the classical ideal gas in the microcanonical ensemble. This
can be seen by observing that, in the classical limit (3-60b), the state
counting can be performed by looking at the gas molecules as identical
but distinguishable particles like a set of billiard balls (whose positions
distinguish them). For such particles, the state counting in the
microcanonical ensemble can be performed by first attaching distinguishing
tags to the particles, counting the number of microstates in a
distribution, and then dividing by N! by way of forgetting about the tags. One
thereby obtains, for the classical case,

$$\text{[classical (MB) statistics:]}\quad W(\{n_i\}) = \prod_i \frac{g_i^{n_i}}{n_i!}, \tag{3-68}$$

from which the formula for the occupation numbers in the most probable
distribution is obtained as (see, for instance, [85])

$$n_i^* = N\, \frac{g_i e^{-\beta\epsilon_i}}{\sum_j g_j e^{-\beta\epsilon_j}}. \tag{3-69}$$

From this, one obtains the probability of a single-particle state
$|\epsilon_\alpha\rangle$ as

$$p(\alpha) = \frac{e^{-\beta\epsilon_\alpha}}{\sum_\alpha e^{-\beta\epsilon_\alpha}}, \tag{3-70}$$

the Maxwell-Boltzmann formula for the energy distribution among the
molecules of an ideal gas (refer back to (3-58b)).

Analogous to the FD and BE cases, the derivation involves two Lagrange
multipliers $\gamma, \beta$, where $\beta\,(= \frac{1}{k_B T})$ is proportional
to the inverse temperature and
$\gamma\,(= \ln[\frac{1}{N}\sum_\alpha e^{-\beta\epsilon_\alpha}])$ equals
$-\beta\mu$, $\mu$ being the chemical potential per particle of the gas.

3.3.6 The ideal FD gas


As mentioned in sec. 3.3.3.1, and as evident from the expressions (3-57),
the thermodynamic behavior of the ideal FD gas can be completely
described in terms of the single-particle occupancy probability (refer to
eq. (3-46b), where one has to put $m_\alpha = 1$ to obtain the single-particle
occupancy probability and has to take the upper sign for the FD gas),
referred to as the Fermi function,

$$f_\alpha = \frac{1}{e^{\beta(\epsilon_\alpha - \mu)} + 1}. \tag{3-71a}$$

Here the suffix $\alpha$ is once again used to refer to single-particle states
and actually stands, in the thermodynamic limit, for four indices
$(p_1, p_2, p_3, m_s)$ of which the first three are continuous variables, each
ranging from $-\infty$ to $\infty$ (recall that each had a range from 0 to
$\infty$ to start with, which was then converted to $(-\infty, \infty)$ by
inserting a factor of $\frac{1}{2}$ for each variable in the relevant integrals
where the integrands involved even functions of these variables), and the
fourth one is a discrete variable taking up $g\,(= 2s + 1)$ values
ranging from $-s$ to $s$ (we assume that the spin is the only internal mode of
excitation). It is convenient, at times, to change over from $p_i$ to
$k_i = \frac{p_i}{\hbar}$ ($i = 1, 2, 3$) where the vector $\mathbf{k}$ with
components $k_1, k_2, k_3$ is referred to as the ‘wave vector’ of a
single-particle state. Note that the energy $\epsilon_\alpha$ depends
only on the indices $(p_1, p_2, p_3)$ (or, equivalently, on the wave vector
$\mathbf{k}$), and not on $m_s$. The Fermi function $f_\alpha$ then refers to
the occupancy probability of any one of the g single-particle states with
wave vector $\mathbf{k}$:

$$f_{\mathbf{k} m_s} = \frac{1}{e^{\beta(\epsilon_{\mathbf{k}} - \mu)} + 1}. \tag{3-71b}$$

Thus, the Fermi function depends essentially on only one variable
$p = |\mathbf{p}|$ (or $k = \frac{p}{\hbar}$), apart from the parameters
$\beta, \mu$, or else on $\epsilon = \frac{p^2}{2m}$, as do the
integrands in the expressions for the thermodynamic functions in (3-57).


In general, for arbitrarily specified values of the parameters $\beta, \mu$
(one commonly uses the fugacity $y$ instead of $\mu$ as a relevant parameter),
the thermodynamic functions cannot be evaluated explicitly, and more precise
information is obtained in two limiting situations: those pertaining to
the so-called ‘weakly degenerate’ ($y$ small) and the ‘strongly degenerate’
($y$ large) FD gas, the former being close to the classical limit where the
mean occupancy of single-particle states is low, and the latter to the
contrasting situation where the mean occupancy of a set of single-particle
states with energies up to a certain maximum value is close to unity (recall
that the occupancy of a single-particle state can be either zero or unity
for a fermion).


3.3.6.1 Weakly degenerate FD gas 


For the weakly degenerate FD gas, the parameter $y$ is small, and all the
thermodynamic functions can be expressed in the form of power series in
$y$, the first few terms of any of these series being relevant from a practical
point of view. For instance, the series for the mean particle density
$n = \frac{\bar N}{V}$ is obtained by expanding the integrand in the second
equality in (3-57) (recall that in all occurrences of the sign ambiguity,
‘$\pm$’ or ‘$\mp$’, the upper sign is to be taken for the FD case) and
integrating term by term. This gives

$$n = \frac{\bar N}{V} = \frac{g}{\lambda_T^3}\, f_{3/2}(y), \tag{3-72a}$$

where $f_{3/2}(y)$ stands for the series

$$f_{3/2}(y) = \sum_{l=1}^{\infty} (-1)^{l+1} \frac{y^l}{l^{3/2}}. \tag{3-72b}$$

One thus obtains the series expansion of $n$ for small $y$,

$$n = \frac{g}{\lambda_T^3}\, f_{3/2}(y) = \frac{g y}{\lambda_T^3}\left(1 - \frac{y}{2\sqrt{2}} + \frac{y^2}{3\sqrt{3}} - \cdots\right). \tag{3-72c}$$


The pressure of the weakly degenerate Fermi gas (as the FD gas is at times
referred to) can be similarly worked out from the fifth equality in (3-57) by
expanding the integrand on the right hand side in a power series in
$y e^{-\beta\epsilon}$ and integrating term by term. Defining the function
$f_{5/2}(y)$ by the series

$$f_{5/2}(y) = \sum_{l=1}^{\infty} (-1)^{l+1} \frac{y^l}{l^{5/2}}, \tag{3-73a}$$

one obtains

$$p = \frac{g k_B T}{\lambda_T^3}\, f_{5/2}(y) = \frac{g k_B T y}{\lambda_T^3}\left(1 - \frac{y}{4\sqrt{2}} + \frac{y^2}{9\sqrt{3}} - \cdots\right). \tag{3-73b}$$

Making use of these expressions for $n$ and $p$, one arrives at the equation
of state of the weakly degenerate FD gas:

$$pV = \bar N k_B T\left(1 + \frac{1}{4\sqrt{2}}\, y + \left(\frac{1}{16} - \frac{2}{9\sqrt{3}}\right) y^2 + \cdots\right), \tag{3-73c}$$

which gives the quantum correction over the classical ideal gas equation
of state. Noting from (3-72c) that $y \approx g^{-1} n \lambda_T^3$
($\lambda_T = \frac{h}{\sqrt{2\pi m k_B T}}$), one finds that
the leading correction involves the third power of the Planck constant,
while the next correction involves the sixth power.


The expression (3-72c) can be inverted in order to obtain a series expansion
for the fugacity $y$ in terms of the dimensionless parameter
$\zeta = \frac{n \lambda_T^3}{g}$:

$$y = \zeta\left(1 + \frac{1}{2\sqrt{2}}\, \zeta + \left(\frac{1}{4} - \frac{1}{3\sqrt{3}}\right)\zeta^2 + \cdots\right). \tag{3-74}$$

The fugacity is a decreasing function of the temperature, going to zero as
$T \to \infty$.
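The coefficients in the equation of state (3-73c) and in the inverted series (3-74) can be checked against the defining series (3-72b) and (3-73a). The sketch below assumes the alternating series forms $\sum_l (-1)^{l+1} y^l / l^{s}$ with $s = 3/2$ and $s = 5/2$ for the two functions; the function names and the sample values of $y$ are illustrative.

```python
import math

def f_series(y, s, terms=200):
    """Alternating series sum_{l>=1} (-1)^(l+1) y^l / l^s, for 0 <= y < 1."""
    return sum((-1) ** (l + 1) * y**l / l**s for l in range(1, terms + 1))

def pressure_ratio_exact(y):
    """pV / (N k_B T) = f_{5/2}(y) / f_{3/2}(y), from (3-72a) and (3-73b)."""
    return f_series(y, 2.5) / f_series(y, 1.5)

def pressure_ratio_truncated(y):
    """Right hand side of (3-73c), divided by N k_B T, up to order y^2."""
    c1 = 1.0 / (4.0 * math.sqrt(2.0))
    c2 = 1.0 / 16.0 - 2.0 / (9.0 * math.sqrt(3.0))
    return 1.0 + c1 * y + c2 * y * y

def fugacity_from_zeta(zeta):
    """Truncated inversion (3-74), with zeta = n * lambda_T^3 / g."""
    c1 = 1.0 / (2.0 * math.sqrt(2.0))
    c2 = 0.25 - 1.0 / (3.0 * math.sqrt(3.0))
    return zeta * (1.0 + c1 * zeta + c2 * zeta * zeta)

for y in (0.01, 0.05, 0.1):
    zeta = f_series(y, 1.5)            # zeta as a function of y, from (3-72a)
    print(f"y = {y}: exact ratio = {pressure_ratio_exact(y):.8f}, "
          f"truncated = {pressure_ratio_truncated(y):.8f}, "
          f"y from (3-74) = {fugacity_from_zeta(zeta):.8f}")
```

The differences between the exact and the truncated expressions shrink like the first neglected power of $y$ (or of $\zeta$), as expected of a consistent expansion.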


Fig. 3-3 compares schematically the variation of pressure (p) with volume
per particle ($v = \frac{V}{\bar N}$) for a weakly degenerate ideal FD gas with
that for a classical ideal gas, both at the same temperature (an isotherm for
a weakly degenerate BE gas is also shown, refer to sec. 3.3.7.1). One
observes that, for any specified value of $v$, the FD pressure is higher than
the classical pressure, which is explained by the fact that the restriction
on the occupancy of single-particle states by more than one particle
results in an effective repulsive force acting between the particles of the
gas (the ‘Fermi repulsion’).


Further details pertaining to fig. 3-3, and to the statistical mechanics of
FD and BE gases, are to be found, among other standard references in
equilibrium statistical mechanics, in [117], chapter 7, and in [67], chapters
8, 11, 12.

Figure 3-3: Depicting isothermal curves (pressure (p) against volume per particle
$v = \frac{V}{\bar N} = \frac{1}{n}$ at a fixed temperature; schematic) for (A) the
classical ideal gas, (B) the ideal Fermi (FD) gas, and (C) the ideal Bose (BE) gas;
in the region of large values of $v$ the FD and the BE gases are weakly degenerate
and the three curves differ little from one another; the small deviations of (B)
and (C) from (A) correspond to quantum corrections given by formulas (3-73c) and
(3-82); for smaller values of $v$, the deviations of (B) and (C) from (A) are more
marked, especially for (C) corresponding to BE condensation; the classical theory
of the ideal gas loses its relevance in this regime of strong degeneracy; note
that the pressure of the Fermi gas for any given $v$ is larger than that of the
classical gas, while the pressure of the Bose gas is smaller; this corresponds to
an effective repulsive interaction among the particles making up a Fermi gas, and
an effective attractive interaction for a Bose gas.


While formulae (3-73c), (3-82) (for the latter, see sec. 3.3.7.1 below) give
the quantum correction to the classical equation of state for an ideal
gas (FD or BE), we will have a look at the quantum correction for a real
gas made up of weakly interacting particles (with, however, a hard core
repulsive interaction), in section 4.2.

3.3.6.2 Strongly degenerate FD gas


The thermodynamic behavior of the strongly degenerate Fermi gas (as the
FD gas is at times referred to) differs markedly from that of the classical
ideal gas. Strong degeneracy corresponds to small values of the ratio
$\frac{k_B T}{\epsilon_F}$, where $\epsilon_F$, the Fermi energy (or, the Fermi
level), is defined as the chemical potential $\mu$ at $T \to 0$ for any given
value of the mean particle density $n = \frac{\bar N}{V}$. For
$\frac{k_B T}{\epsilon_F} \ll 1$, $\mu$ is close to $\epsilon_F$, and one can
expand all the relevant thermodynamic quantities in powers of
$\frac{k_B T}{\epsilon_F}$.

At the absolute zero of temperature the Fermi function $f(\epsilon)$ (refer to
(3-71a); the notation is slightly changed now, with $f(\epsilon)$ denoting the
probability of occupation of a single-particle state with energy $\epsilon$)
is discontinuous, having the value unity for $\epsilon < \epsilon_F$ and zero
for $\epsilon > \epsilon_F$. In other words, all single-particle states with
energy less than the Fermi energy are occupied while states at higher
energies are all unoccupied, as depicted in fig. 3-4. It is straightforward
to evaluate the Fermi energy in terms of the mean particle density from the
second equality in (3-57) which gives, with $\beta \to \infty$ ($T \to 0$),

$$\epsilon_F = \frac{h^2}{2m}\left(\frac{3n}{4\pi g}\right)^{2/3} \tag{3-75}$$

(check this out).


The internal energy and the pressure of the FD gas at T = 0 can be
similarly evaluated from the third and the fifth equalities in (3-57), that
give

$$U|_{T=0} = \frac{4\pi g V}{5}\left(\frac{2m}{h^2}\right)^{3/2} \epsilon_F^{5/2} = \frac{3}{5}\, \bar N \epsilon_F, \tag{3-76a}$$

$$p|_{T=0} = \frac{8\pi g}{15}\left(\frac{2m}{h^2}\right)^{3/2} \epsilon_F^{5/2} = \frac{2 \bar N}{5 V}\, \epsilon_F \tag{3-76b}$$

(check these out as well).

Figure 3-4: Depicting schematically the Fermi function ($f(\epsilon)$; refer to
formula (3-71a) in which the discrete variable $\epsilon_\alpha$ is replaced with a
continuous variable $\epsilon$) at (A) absolute zero (T = 0), and (B) a low
temperature $k_B T \ll \epsilon_F$, for a fixed value of $n = \frac{\bar N}{V}$; at
T = 0, all single-particle states from $\epsilon = 0$ up to $\epsilon = \epsilon_F$
are occupied, while higher energy states are unoccupied; at a non-zero low
temperature ($\mu$ slightly less than $\epsilon_F$, see formula (3-77a)) a few of
the states with energies less than but close to $\mu$ are vacated and a few states
with energies slightly larger than $\mu$ are populated; at $\epsilon = \mu$ the
value of the Fermi function is $\frac{1}{2}$.


In these expressions, $U|_{T=0}$ stands for the energy of the Fermi ‘sea’, made
up of occupied single-particle states with energy ranging from 0 (the
‘bottom’ of the sea) up to $\epsilon_F$ (the ‘top’ of the sea), while $p$ is
the pressure resulting from the effective repulsive interaction (the ‘Fermi
repulsion’) that prevents all the particles from collecting at the bottom of
the sea, both being specifically of quantum mechanical origin. Finally, the
entropy is zero at T = 0, in keeping with the third law of thermodynamics.
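As a rough numerical illustration of the zero-temperature results, one can evaluate the Fermi energy, the energy per particle and the pressure for the conduction electrons of a metal treated as an ideal FD gas. The sketch below assumes the free-electron form $\epsilon_F = \frac{h^2}{2m}\left(\frac{3n}{4\pi g}\right)^{2/3}$ with $g = 2$, together with $U = \frac{3}{5}\bar N \epsilon_F$ and $p = \frac{2}{5} n \epsilon_F$; the electron density of copper used here (8.47e28 per cubic metre) is a commonly quoted value assumed for illustration, not a figure taken from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_E = 9.1093837015e-31    # electron mass, kg
EV = 1.602176634e-19      # one electron volt in joules

def fermi_energy(mass, number_density, g):
    """epsilon_F = (h^2 / 2m) * (3 n / (4 pi g))^(2/3), in joules."""
    return (H**2 / (2.0 * mass)) * (
        3.0 * number_density / (4.0 * math.pi * g)) ** (2.0 / 3.0)

def energy_per_particle(eps_f):
    """U / N at T = 0, i.e. (3/5) epsilon_F."""
    return 0.6 * eps_f

def pressure_at_zero_temperature(number_density, eps_f):
    """p at T = 0, i.e. (2/5) n epsilon_F."""
    return 0.4 * number_density * eps_f

n_copper = 8.47e28                      # assumed electron density, m^-3
eps_f = fermi_energy(M_E, n_copper, g=2)
print(f"Fermi energy        : {eps_f / EV:.2f} eV")
print(f"Fermi temperature   : {eps_f / K_B:.3e} K")
print(f"energy per electron : {energy_per_particle(eps_f) / EV:.2f} eV")
print(f"pressure at T = 0   : {pressure_at_zero_temperature(n_copper, eps_f):.3e} Pa")
```

With these inputs the Fermi energy comes out at a few electron volts, corresponding to a Fermi temperature of several tens of thousands of kelvin, so that the electron gas in a metal is strongly degenerate at room temperature.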


If the temperature is made to increase from T = 0 to a small positive value,
then the chemical potential undergoes a small change and the Fermi
function looks as in fig. 3-4, with a sharp bend around $\epsilon = \mu$
(recall that in the thermodynamic limit, the single-particle energies
effectively form a continuum), where one has $f(\epsilon = \mu) = \frac{1}{2}$.
Energy-wise, the particles in the strongly degenerate FD gas may be assumed
to fill up the Fermi ‘sea’ from bottom upward, up to energies close to
$\epsilon = \mu$ where, near the top of the sea, a small number of particles
cross the level $\epsilon = \mu$, to energies slightly larger than $\mu$. As
mentioned above, one can expand the various thermodynamic quantities of the
strongly degenerate FD gas in powers of $\frac{k_B T}{\epsilon_F}$ about their
values at T = 0 (see, for instance, [85]). Thus, with a given value of the
mean particle density $n$ the chemical potential varies with temperature as

$$\mu = \epsilon_F\left(1 - \frac{\pi^2}{12}\left(\frac{k_B T}{\epsilon_F}\right)^2 + \cdots\right), \tag{3-77a}$$

while the internal energy varies as

$$U = \frac{3}{5}\, \bar N \epsilon_F\left(1 + \frac{5\pi^2}{12}\left(\frac{k_B T}{\epsilon_F}\right)^2 + \cdots\right). \tag{3-77b}$$

The last relation leads to the following approximate expression for the
specific heat of the strongly degenerate FD gas

$$C_V = \left(\frac{\partial U}{\partial T}\right)_{V,N} \approx \frac{\pi^2}{2}\, \bar N k_B\, \frac{k_B T}{\epsilon_F}. \tag{3-78}$$
Before I conclude this section I need to mention that there is an
intermediate regime in between the regimes of weak and strong degeneracy where
one has to evaluate the integrals in (3-57) numerically so as to understand
the behavior of a Fermi gas over the entire ranges of the relevant
variables, such as the temperature (T), the chemical potential ($\mu$), and
the mean number of particles per unit volume ($n$). As a particular point
of interest, it is worthwhile to know how the chemical potential varies as
the temperature is made to change from 0 (strong degeneracy) to infinity
(weak degeneracy), for any given value of $n$. The following scheme will
give you the trend of variation of the parameters $\mu$, $\beta\mu$, and
$y = e^{\beta\mu}$ with temperature for a fixed value of $n$:

$$T: 0 \to \infty, \qquad \mu: \epsilon_F \to -\infty, \qquad \beta\mu: \infty \to -\infty, \qquad y: \infty \to 0.$$

The important thing to note in this scheme is that, for large T, $\mu$ behaves
like $-\frac{3}{2} k_B T \ln T$, and so $\beta\mu$ behaves like
$-\frac{3}{2} \ln T$.


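The chemical potential (per molecule) of the classical ideal gas is obtained
from (3-23) as

$$\mu = -T\Big(\frac{\partial S}{\partial N}\Big)_{U,V} = k_BT\ln\Big[\frac{N}{V}\Big(\frac{h^2}{2\pi mk_BT}\Big)^{3/2}\Big], \tag{3-79}$$
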
since, in the large $N$ limit, one has $\bar{N} \to N$.

Fig. 3-5 depicts schematically the variation of $\mu$ with $T$ for fixed $n$ indicated
above.

Figure 3-5: Depicting the variation of the chemical potential of a FD gas with temperature
$T$ for a fixed value of the mean particle density $n$ (schematic); $\epsilon_F$ denotes the
Fermi energy; at high temperatures ($T \to \infty$), $\mu$ varies like that of the classical ideal gas.
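The trend of $\mu$ with $T$ at fixed $n$ can also be checked numerically. The sketch below is an illustration only, not a method taken from the text: it works in the assumed reduced units $t = k_BT/\epsilon_F$ and $m = \mu/\epsilon_F$, in which fixing $n$ amounts to the condition $\frac{3}{2}\int_0^\infty \frac{x^{1/2}\,dx}{e^{(x-m)/t}+1} = 1$ (the particle number integral of an ideal FD gas divided by its $T = 0$ value), and solves this condition for $m$ by bisection. The function names, the midpoint quadrature, the cut-off and the bisection bracket are all choices made for this sketch.

```python
import math

def fd_filling(m, t, n_steps=2000):
    """Return (3/2) * integral_0^inf sqrt(x) / (exp((x - m)/t) + 1) dx.

    x = eps/eps_F, m = mu/eps_F, t = k_B*T/eps_F. The substitution x = s**2
    removes the square-root behaviour at x = 0; a midpoint rule is then used.
    """
    s_max = math.sqrt(max(m, 0.0) + 40.0 * t)   # the tail beyond this is ~exp(-40)
    h = s_max / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * h
        arg = (s * s - m) / t
        if arg < 700.0:                          # guard against overflow in exp
            total += 2.0 * s * s / (math.exp(arg) + 1.0)
    return 1.5 * total * h

def reduced_mu(t):
    """Solve fd_filling(m, t) = 1 for m = mu/eps_F by bisection."""
    lo, hi = -60.0 * t - 1.0, 2.0                # assumed bracket for the root
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if fd_filling(mid, t) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for t in (0.05, 0.5, 1.0, 5.0):
    print(t, reduced_mu(t))
```

For small $t$ the result should stay close to the expansion (3-77a), while for large $t$ it should approach the classical form, which in these units reads $m \approx t\ln\big(\frac{4}{3\sqrt{\pi}}t^{-3/2}\big)$.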


3.3.7 The ideal BE gas 


As in the FD case, the thermodynamic behavior of the ideal BE gas is
close to that of the classical ideal gas for sufficiently small values of the
fugacity ($\gamma \approx 0$), and one can obtain a complete description of the thermodynamic
functions by means of series expansions in powers of $\gamma$ (see
sec. 3.3.7.1 below) in much the same way as in the FD case, with the difference
that now one has to choose the lower signs in the equations (3-57)
when confronted with a sign choice (‘$\pm$’ or ‘$\mp$’). With increasing values of
$\gamma$ the BE gas departs more and more from the classical description till,
as the temperature is made to decrease below a certain value (with the
mean particle number density held fixed), there occurs a phase transition
in spite of the fact that the gas is made up of non-interacting particles.
This is the most conspicuous phenomenon relating to the strongly degenerate
BE gas, as briefly outlined in sec. 3.3.7.2 below, differing dramatically
from the behavior of the classical ideal gas that does not undergo a
phase transition at any temperature, however low.


3.3.7.1 Weakly degenerate BE gas 


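Proceeding in a manner analogous to the FD case, and defining the functions
$g_1(\gamma)$, $g_2(\gamma)$ by means of the series

$$g_1(\gamma) = \sum_{l=1}^{\infty}\frac{\gamma^l}{l^{3/2}}, \qquad g_2(\gamma) = \sum_{l=1}^{\infty}\frac{\gamma^l}{l^{5/2}}, \tag{3-80}$$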


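the expressions for the mean particle number density and the pressure
of a weakly degenerate BE gas are obtained as

$$n = \frac{N}{V} = \frac{g}{\lambda_T^3}g_1(\gamma) = \frac{g}{\lambda_T^3}\Big(\gamma + \frac{\gamma^2}{2\sqrt{2}} + \frac{\gamma^3}{3\sqrt{3}} + \cdots\Big) \tag{3-81a}$$

and

$$p = \frac{gk_BT}{\lambda_T^3}g_2(\gamma) = \frac{gk_BT}{\lambda_T^3}\Big(\gamma + \frac{\gamma^2}{4\sqrt{2}} + \frac{\gamma^3}{9\sqrt{3}} + \cdots\Big), \tag{3-81b}$$
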
from which one obtains the equation of state

$$pV = Nk_BT\Big(1 - \frac{1}{4\sqrt{2}}\gamma - \Big(\frac{2}{9\sqrt{3}} - \frac{1}{16}\Big)\gamma^2 + \cdots\Big). \tag{3-82}$$

Once again, this constitutes a correction over the classical ideal gas equation
of state $pV = Nk_BT$, and the leading correction corresponds to a reduction
of the pressure (calculated for given values of $N$, $V$ and $T$) when
compared with the classical value, in contrast to an increase in the FD
case.


Analogous to (3-74), one has the following expansion of the fugacity $\gamma$ of a
weakly degenerate BE gas as a power series in the parameter $\zeta = \frac{N}{gV}\lambda_T^3$:

$$\gamma = \zeta\Big(1 - \frac{1}{2\sqrt{2}}\zeta + \Big(\frac{1}{4} - \frac{1}{3\sqrt{3}}\Big)\zeta^2 + \cdots\Big). \tag{3-83}$$

As in the FD case, $\mu$ goes to $-\infty$ like $-\frac{3}{2}k_BT\ln T$ when $T$ is made very large.
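The series expansions of this section are easy to test against direct summation. The following sketch, offered only as an illustration, sums the series (3-80) numerically for a small fugacity and compares the exact ratio $pV/(Nk_BT) = g_2(\gamma)/g_1(\gamma)$ with a second-order expansion in $\gamma$, and the fugacity with a second-order inversion in $\zeta = g_1(\gamma)$. The helper names and the value $\gamma = 0.1$ are assumptions made for the example.

```python
import math

def g_series(gamma, s, tol=1e-15):
    """Sum of gamma**l / l**s over l = 1, 2, ... for 0 <= gamma < 1."""
    total, power, l = 0.0, gamma, 1
    while power / l ** s > tol:
        total += power / l ** s
        power *= gamma
        l += 1
    return total

def g1(gamma):
    return g_series(gamma, 1.5)

def g2(gamma):
    return g_series(gamma, 2.5)

gamma = 0.1
zeta = g1(gamma)                    # zeta = N*lambda_T**3/(g*V), cf. (3-81a)
pv_exact = g2(gamma) / g1(gamma)    # pV/(N k_B T), from (3-81a) and (3-81b)
pv_series = (1 - gamma / (4 * math.sqrt(2))
             - (2 / (9 * math.sqrt(3)) - 1 / 16) * gamma ** 2)
gamma_series = zeta * (1 - zeta / (2 * math.sqrt(2))
                       + (0.25 - 1 / (3 * math.sqrt(3))) * zeta ** 2)
print(pv_exact, pv_series)
print(gamma, gamma_series)
```

Both comparisons should agree to within the size of the first neglected term, and the ratio $pV/(Nk_BT)$ comes out slightly below unity, in line with the reduction of pressure noted above.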


3.3.7.2 Strongly degenerate ideal BE gas: Bose-Einstein condensation


As mentioned above, the strongly degenerate ideal BE gas differs markedly
from the classical ideal gas in that, as the temperature is made to decrease
for a given value of the mean particle number density $n$, a phase
transition takes place due to a discontinuous crowding of the particles
in the single-particle state of the lowest energy (such a crowding is prohibited
in the FD case by virtue of the Pauli exclusion principle). As seen
from formula (3-49), the mean occupation number in the ground state,
which we assume to be at energy $\epsilon_0 = 0$ (this actually means that we write
$\mu$ for $\mu - \epsilon_0$; the zero of the energy scale is already fixed by assuming the
energy of the vacuum to be zero), is given by

$$\bar{n}_0 = \frac{\gamma}{1 - \gamma}, \tag{3-84}$$

which shows that $\bar{n}_0$ diverges as $\gamma$ ($= e^{\beta\mu}$) goes to 1. Thus, as the temperature
is made to decrease, $\gamma$ remains within the range $0 < \gamma < 1$ (in
contrast to the FD case where $\gamma$ varies from 0 to $\infty$), i.e., $\mu$ remains in
the range $-\infty$ to 0 and, for $\gamma \to 1$, one cannot replace the sums over the
single-particle states in (3-52a) to (3-52d) with the corresponding integrals
as in (3-57). This is because of the anomalous population of the
single-particle ground state, when the ground state alone gives a finite
contribution to the single-particle sums as $V \to \infty$ while the contribution
of all the other states taken together can be approximated by integrals as
in (3-57).


All single-particle energies (including the chemical potential per particle)
will henceforth be defined relative to $\epsilon_0$. In the thermodynamic limit, $\epsilon_0$ itself
goes to zero.


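Looking at the second formula in (3-57), which has been obtained by approximating
a sum over discretely distributed single-particle states with
an integral over continuously distributed energies, one notes that, with
$\mu < 0$, the integral on its right hand side necessarily satisfies the inequality

$$\int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{e^{\beta(\epsilon - \mu)} - 1} \le \int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{e^{\beta\epsilon} - 1} = \frac{\sqrt{\pi}}{2}\,\zeta\big(\tfrac{3}{2}\big)\,\beta^{-3/2}, \tag{3-85a}$$

where $\zeta(\frac{3}{2})$ is given by

$$\zeta\big(\tfrac{3}{2}\big) = \sum_{l=1}^{\infty}\frac{1}{l^{3/2}} \approx 2.612. \tag{3-85b}$$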
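The bound on the integral can be verified numerically. The sketch below is an illustration under stated assumptions: it uses the dimensionless variables $x = \beta\epsilon$ and $a = -\beta\mu \ge 0$, evaluates $\int_0^\infty \frac{x^{1/2}\,dx}{e^{x+a}-1}$ by a midpoint rule after the substitution $x = s^2$, and compares the value at $a = 0$ with $\frac{\sqrt{\pi}}{2}\zeta(\frac{3}{2})$. The function name, the number of steps and the cut-off are choices made for this example.

```python
import math

def bose_integral(a, n_steps=20000, s_max=7.0):
    """Midpoint estimate of integral_0^inf sqrt(x) / (exp(x + a) - 1) dx, a >= 0.

    Here x = beta*eps and a = -beta*mu; the substitution x = s**2 keeps the
    integrand finite at the lower limit even when a = 0.
    """
    h = s_max / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * h
        total += 2.0 * s * s / math.expm1(s * s + a)
    return total * h

upper_bound = bose_integral(0.0)
print(upper_bound)                                   # compare with (sqrt(pi)/2) * 2.612
print([bose_integral(a) for a in (0.01, 0.1, 1.0)])  # values for increasing -beta*mu
```

Since the integral is largest at $\mu = 0$, the excited states can accommodate only a bounded number of particles at a given temperature and volume, which is the observation used in the next step of the argument.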


This tells us that the replacement of the sum with an integral mentioned
above is inconsistent for $\gamma \to 1$ and for a specified value of $N$ if $\beta$ is allowed
to be arbitrarily large (reason out why). In other words, one expects
that the said replacement will work down to a temperature given by

$$\frac{N}{gV}\lambda_T^3 = 2.612 \qquad \Big(\lambda_T = \frac{h}{(2\pi mk_BT)^{1/2}}\Big), \tag{3-86a}$$

which yields

$$T = T_0 \equiv \frac{h^2}{2\pi mk_B}\Big(\frac{N}{2.612\,gV}\Big)^{2/3}, \tag{3-86b}$$

another way of expressing the same result being

$$\lambda_{T_0}^3 = g_1(1)\frac{gV}{N} \qquad \Big(\lambda_{T_0} = \frac{h}{(2\pi mk_BT_0)^{1/2}}\Big). \tag{3-86c}$$


In other words, the chemical potential $\mu$ attains a value very close to zero
(i.e., $\gamma$ very close to 1) as $T \to T_0$ from above, up to which point the ground
state population continues to be of a microscopic magnitude just like that of any
other single-particle state. For $T < T_0$, on the other hand, the chemical
potential continues to tend to zero (remaining negative and very close to
zero all the while), while the single-particle ground state becomes macroscopically
populated, i.e., the number of particles populating it becomes
of the order of $N$, and keeps on increasing down to $T = 0$. Denoting the
ground state population at any temperature $T$ ($< T_0$) by $N_0(T)$, one obtains

$$[T < T_0] \qquad N = N_0(T) + 2\pi gV\Big(\frac{2m}{h^2}\Big)^{3/2}\int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{e^{\beta\epsilon} - 1} \tag{3-87}$$

or, on using the defining equation for $T_0$ (refer to formula (3-86a)),

$$[T < T_0] \qquad N = N_0(T) + N\Big(\frac{T}{T_0}\Big)^{3/2}. \tag{3-88}$$

Thus, the ground state population increases from zero (or, more precisely,
from an infinitesimally small value) at $T = T_0$ to $N$ as $T$ is made to
decrease to 0.
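To get a feel for the magnitudes involved, the condensation temperature and the condensate fraction can be evaluated for a concrete density. The sketch below is illustrative only: it uses the number density of liquid helium-4, taking an assumed rounded mass density of 145 kg/m$^3$, an atomic mass of 4.0026 u and $g = 1$, and treats the atoms as an ideal gas, which real liquid helium is not. The function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053907e-27      # atomic mass unit, kg
ZETA_3_2 = 2.612          # g_1(1), the value quoted in (3-85b)

def bec_temperature(n, m, g=1):
    """Condensation temperature T_0 of (3-86b) for number density n (in m^-3)."""
    return H ** 2 / (2.0 * math.pi * m * K_B) * (n / (ZETA_3_2 * g)) ** (2.0 / 3.0)

def condensate_fraction(T, T0):
    """N_0/N from (3-88); taken as zero above T_0 (thermodynamic limit)."""
    return 1.0 - (T / T0) ** 1.5 if T < T0 else 0.0

m_he4 = 4.0026 * AMU                  # mass of a helium-4 atom
n_he4 = 145.0 / m_he4                 # assumed mass density of 145 kg/m^3
T0 = bec_temperature(n_he4, m_he4)
print(T0)                             # of the order of 3 K
print(condensate_fraction(0.5 * T0, T0))
```

The ideal gas estimate comes out at a few kelvin, and at half the transition temperature about two thirds of the particles are already in the ground state according to (3-88).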


An estimate of the chemical potential for $T < T_0$ is obtained by referring to
formula (3-49), from which the mean occupation number for the ground
state ($\epsilon_0 = 0$; however, $\epsilon_0$ is actually not zero; refer to notes following
(2-39b) and (3-40); all occurrences of $\mu$ are to be interpreted to stand
actually for $\mu - \epsilon_0$) comes out as

$$\bar{n}_0 = \frac{1}{e^{-\beta\mu} - 1}. \tag{3-89a}$$

This can then be equated to the mean ground state population as given
by the formula (3-88), which gives, for $|\beta\mu| \ll 1$,

$$[T < T_0] \qquad \mu \approx -\frac{k_BT}{N}\Big(1 - \Big(\frac{T}{T_0}\Big)^{3/2}\Big)^{-1}. \tag{3-89b}$$

This formula does not hold for a very small range of temperature just below
$T = T_0$.


A good way to visualize a strongly degenerate BE gas is to imagine it as a
two component fluid, where one of the components is the normal one, with
an occupation of continuously distributed single-particle states while the
other, which appears at $T < T_0$ (for any given value of the mean particle
density $\frac{N}{V}$), is the ‘condensate’ made up of a macroscopic population of
particles in the ground state. At such low temperatures, the relative population
of the ground state in comparison with a few of the next excited
states is determined, not by the value of $\beta$ but mainly by $\mu$, which satisfies
$|\mu| \ll \epsilon_1$ (in accordance with the convention we have adopted, this
actually means $|\mu - \epsilon_0| \ll \epsilon_1 - \epsilon_0$; for a BE gas, $\mu < \epsilon_0$ at all temperatures
for any given value of $\frac{N}{V}$ which, according to our convention, we will write
as $\mu < 0$). Incidentally, one can alternatively describe the scenario of BE
condensation by looking at the mean particle density $n$ (at a specified
temperature) as the relevant parameter rather than the temperature at
a specified value of $\frac{N}{V}$, where the condensation occurs as $n$ is made to
increase above a threshold value.


Looking at formula (3-49), and focusing on the mean occupation of the
ground state for a BE gas, for which one has to choose the negative sign in
the denominator of the right hand side, one obtains the condition $\mu - \epsilon_0 < 0$
which is expressed as $\mu < 0$ (on referring all single-particle energy expressions
relative to the single-particle ground state) since a violation of this
condition would mean that the occupation number for the ground state
would be negative, which is a contradiction. If this condition is satisfied,
then the occupation number for any other single-particle state necessarily
turns out to be positive (decreasing with increasing energy).


The thermodynamic properties of the strongly degenerate BE gas can be
worked out by referring to the two-component picture, where the contribution
of the first component, made up of particles distributed continuously
among the single-particle states, is obtained by taking into account
the fact that the mean particle density in it is temperature dependent for
$T < T_0$ (one can put $\gamma \approx 1$ in calculating the relevant integrals in this
temperature range), while the second component, made up of the condensate,
does not contribute to most of the thermodynamic functions.
For instance, taking into account the contribution of the condensate as
distinct from that of the excited states (refer to [67]) one obtains, for the
strongly degenerate BE gas in the thermodynamic limit,

$$\frac{p}{k_BT} = -2\pi g\Big(\frac{2m}{h^2}\Big)^{3/2}\int_0^\infty d\epsilon\,\epsilon^{1/2}\ln\big(1 - \gamma e^{-\beta\epsilon}\big) - \frac{1}{V}\ln(1 - \gamma), \tag{3-90a}$$

$$\frac{N}{V} = 2\pi g\Big(\frac{2m}{h^2}\Big)^{3/2}\int_0^\infty d\epsilon\,\frac{\epsilon^{1/2}}{\gamma^{-1}e^{\beta\epsilon} - 1} + \frac{1}{V}\frac{\gamma}{1 - \gamma}, \tag{3-90b}$$

where $\gamma$ is given in terms of the mean occupation number of the ground
state by (3-84). The equation of state of the strongly degenerate BE gas
can be obtained from these by eliminating $\gamma$.


On consistently working out, in the manner indicated above, the expressions
for the thermodynamic functions at temperatures below and just
above the transition temperature $T_0$ (for any specified value of $\frac{N}{V}$), one
finds that these are given by two different sets of formulas. A similar scenario
is encountered when the thermodynamic functions are expressed
as functions of the specific volume ($v = \frac{V}{N}$) or the mean particle density
($n = \frac{N}{V}$). For instance, the variation of pressure ($p$) with specific volume
($v$) at any given temperature (i.e., the isothermal curve) of a BE ideal gas
(obtained from formulae (3-90a), (3-90b)) is shown by curve (C) in fig. 3-3
where one observes that, below a critical volume (corresponding to the
onset of BE condensation), the pressure remains constant, while above
the critical volume the pressure variation approaches that of the classical
ideal gas with increasing values of $v$.


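As a concrete example, the expression for the chemical potential $\mu$, which
is given by (3-89b) for $T < T_0$, has a different functional dependence on $T$
for $T$ just larger than $T_0$. This is obtained by adding and subtracting the
integral $\int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{e^{\beta\epsilon} - 1}$ to the one appearing on the right hand side of the second
equality in (3-57) and noting that $2\pi gV\big(\frac{2m}{h^2}\big)^{3/2}$ times this integral is simply
given by $N\big(\frac{T}{T_0}\big)^{3/2}$ (refer to the second term in (3-88)). One thereby obtains

$$[T > T_0] \qquad N = N\Big(\frac{T}{T_0}\Big)^{3/2} + 2\pi gV\Big(\frac{2m}{h^2}\Big)^{3/2}\int_0^\infty \epsilon^{1/2}\Big[\frac{1}{e^{\beta(\epsilon - \mu)} - 1} - \frac{1}{e^{\beta\epsilon} - 1}\Big]d\epsilon. \tag{3-91}$$

In this expression, the integrand is appreciable only for small values of $\epsilon$
near zero owing to the fact that $\mu$ is small near $T = T_0$, and hence one can
expand the exponentials, retaining only the first degree terms in $\epsilon$, $\mu$ and
replacing the expression within the brackets with $\frac{1}{\beta(\epsilon - \mu)} - \frac{1}{\beta\epsilon}$. This gives,

$$[T > T_0] \qquad N = N\Big(\frac{T}{T_0}\Big)^{3/2} + 2\pi gV\Big(\frac{2m}{h^2}\Big)^{3/2}\int_0^\infty \epsilon^{1/2}\Big[\frac{1}{\beta(\epsilon - \mu)} - \frac{1}{\beta\epsilon}\Big]d\epsilon, \tag{3-92a}$$

from which one obtains, after integration and rearrangement,

$$[T > T_0] \qquad \mu \approx -\frac{\big(g_1(1)\big)^2}{4\pi}\,k_BT\Big(1 - \Big(\frac{T_0}{T}\Big)^{3/2}\Big)^2. \tag{3-92b}$$
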
1. This result holds for a small range of temperature above $T = T_0$.
2. The formulas (3-83), and (3-89b), (3-92b) give the variation of $\mu$ with
temperature for a BE gas in the weak and strong degeneracy regimes
respectively, while the variation in the intermediate regime is to be obtained
by numerical evaluation of the second equality in (3-57) (choosing
the lower of the two signs in the denominator on the right hand
side). Contrary to the FD case (refer to fig. 3-5), the chemical potential
(or, more precisely, $\mu - \epsilon_0$) of a BE gas turns out to be negative at all
temperatures for any given value of $n = \frac{N}{V}$, attaining a value very close
to zero below the transition temperature $T_0$.
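The ‘integration and rearrangement’ leading from the linearized particle number equation just above $T_0$ to the chemical potential can be sketched as follows. This is a worked outline under the approximations already stated (small $|\mu|$, $T$ slightly above $T_0$), written with $\mu = -|\mu|$ and using the elementary integral $\int_0^\infty \frac{d\epsilon}{\epsilon^{1/2}(\epsilon + a)} = \frac{\pi}{\sqrt{a}}$ for $a > 0$.

```latex
\begin{align*}
\int_0^\infty \epsilon^{1/2}\Big[\frac{1}{\beta(\epsilon + |\mu|)} - \frac{1}{\beta\epsilon}\Big]d\epsilon
  &= -\frac{|\mu|}{\beta}\int_0^\infty \frac{d\epsilon}{\epsilon^{1/2}(\epsilon + |\mu|)}
   = -\pi k_B T\,|\mu|^{1/2}, \\
N\Big[\Big(\frac{T}{T_0}\Big)^{3/2} - 1\Big]
  &= 2\pi^2 gV\Big(\frac{2m}{h^2}\Big)^{3/2} k_B T\,|\mu|^{1/2}, \\
gV\Big(\frac{2m}{h^2}\Big)^{3/2}
  &= \frac{N}{\pi^{3/2}\,g_1(1)\,(k_B T_0)^{3/2}}
  \qquad \text{(definition of } T_0\text{)}, \\
|\mu|^{1/2}
  &= \frac{g_1(1)}{2\sqrt{\pi}}\,(k_B T)^{1/2}\Big[1 - \Big(\frac{T_0}{T}\Big)^{3/2}\Big].
\end{align*}
```

Squaring the last line gives $\mu \approx -\frac{(g_1(1))^2}{4\pi}k_BT\big(1 - (\frac{T_0}{T})^{3/2}\big)^2$, which vanishes quadratically in $T - T_0$ as $T \to T_0$ from above.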


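Similar results are obtained for the internal energy $U$, which is seen to be
continuous across the transition ($T = T_0$):

$$[T < T_0] \qquad U \approx \frac{3}{2}\frac{g_2(1)}{g_1(1)}Nk_BT\Big(\frac{T}{T_0}\Big)^{3/2}, \tag{3-93a}$$

where the numerical value of $\frac{3}{2}\frac{g_2(1)}{g_1(1)}$ is 0.77 (approx.; refer to (3-80), (3-85b);
the series for $g_2(1)$ is the zeta function with argument $\frac{5}{2}$, and has a value
$\zeta(\frac{5}{2}) \approx 1.341$). For $T$ slightly above $T_0$, on the other hand, one obtains

$$\Big[T > T_0,\ \frac{T - T_0}{T_0} \ll 1\Big] \qquad U \approx \frac{3}{2}\frac{g_2(1)}{g_1(1)}Nk_BT\Big(\frac{T}{T_0}\Big)^{3/2} - \frac{3}{8\pi}\big(g_1(1)\big)^2Nk_BT\Big(\frac{T}{T_0}\Big)^{3/2}\Big(1 - \Big(\frac{T_0}{T}\Big)^{3/2}\Big)^2$$

$$\approx 0.77\,Nk_BT\Big(\frac{T}{T_0}\Big)^{3/2} - 0.81\,Nk_BT\Big(\frac{T}{T_0}\Big)^{3/2}\Big(1 - \Big(\frac{T_0}{T}\Big)^{3/2}\Big)^2. \tag{3-93b}$$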


In order to derive (3-93a), note that the particles condensed in the ground
state do not contribute to the internal energy, while those distributed continuously
in the higher energy states contribute an amount that can be
approximated by the third expression in (3-57), in which one has to put
$\gamma = 1$.

As for (3-93b), one has to expand the integrand in the third equality in (3-57)
in a Taylor series in $\mu$ and ignore terms of the second and higher degrees.
The first term (independent of $\mu$) can be worked out by putting $\gamma = 1$, which
gives a contribution to $U$ given by the right hand side of (3-93a), but now
with $T > T_0$, while the second term (linear in $\mu$) can be evaluated by making
use of (3-92b), after performing an integration by parts. One observes that
$U$ is continuous at $T = T_0$.


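With these expressions for the internal energy, one can work out the specific
heat $C_V$ on the two sides of the transition which, like the internal
energy, is found to be continuous across the transition. The derivative of
the specific heat, $\frac{\partial C_V}{\partial T}$, however, is discontinuous, with the discontinuity
given by

$$\Delta\Big(\frac{\partial C_V}{\partial T}\Big) = \Big(\frac{\partial C_V}{\partial T}\Big)_{T = T_0 - 0} - \Big(\frac{\partial C_V}{\partial T}\Big)_{T = T_0 + 0} \approx \frac{27}{16\pi}\big(g_1(1)\big)^2\frac{Nk_B}{T_0} = 3.66\,\frac{Nk_B}{T_0}, \tag{3-94}$$

where we have used the numerical value of $g_1(1) = \zeta(\frac{3}{2})$, which is $\approx 2.612$.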
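The numerical constants quoted in this section can be reproduced with a short calculation. The sketch below is an illustration, not part of the original derivation: it evaluates $\zeta(s)$ by a direct sum supplemented with an Euler-Maclaurin tail correction (a choice made here because the plain series converges slowly for $s = \frac{3}{2}$), and then forms the combination $\frac{3}{2}\frac{g_2(1)}{g_1(1)}$ appearing in the internal energy and the combination $\frac{27}{16\pi}(g_1(1))^2$, which reproduces the number 3.66 quoted for the jump in the slope of the specific heat.

```python
import math

def zeta(s, n_terms=1000):
    """Riemann zeta function for s > 1: direct sum plus Euler-Maclaurin tail."""
    L = n_terms
    head = sum(l ** -s for l in range(1, L + 1))
    tail = L ** (1 - s) / (s - 1) - 0.5 * L ** -s + s * L ** (-s - 1) / 12.0
    return head + tail

g1_at_1 = zeta(1.5)                                   # g_1(1)
g2_at_1 = zeta(2.5)                                   # g_2(1)
energy_coeff = 1.5 * g2_at_1 / g1_at_1                # prefactor of the internal energy
slope_jump = 27.0 * g1_at_1 ** 2 / (16.0 * math.pi)   # jump of dC_V/dT in units N k_B / T_0
print(g1_at_1, g2_at_1, energy_coeff, slope_jump)
```

The same two zeta values also give the specific heat at the transition itself, $C_V(T_0) = \frac{15}{4}\frac{g_2(1)}{g_1(1)}Nk_B \approx 1.93\,Nk_B$, which follows on differentiating the expression for $U$ below $T_0$.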


Fig. 3-6 depicts schematically the variation of $\frac{C_V}{Nk_B}$ with normalized temperature
($\frac{T}{T_0}$) in the neighborhood of the transition temperature ($T_0$). The
cusp at $\frac{T}{T_0} = 1$ is indicative of the discontinuity of the derivative $\Delta\big(\frac{\partial C_V}{\partial T}\big)$, similar
discontinuities being observed in real life systems, where the particles
in the fluid have a non-zero interaction.


The BE condensation is characterized by a non-zero latent heat, on the basis
of which it is, at times, identified as a phase transition of the first kind. On
the other hand, the discontinuity of the derivative of the specific heat $C_V$ is
cited in favor of the transition being identified as one of the third kind [114].


Digression: BE condensation in interacting systems. 




Figure 3-6: Depicting the variation of the specific heat (in units of $k_B$) of a strongly
degenerate BE gas as a function of the normalized temperature $\frac{T}{T_0}$ in a neighborhood of
the BE condensation temperature $T_0$; the cusp in the graph at $T = T_0$ is indicative of the
BE condensation, where the derivative of the specific heat is discontinuous.


We will now digress briefly to point out how the BE condensation in an
ideal Bose gas compares with the corresponding phenomenon in an interacting
system. The theory of weakly interacting Bose gases will be
outlined in section 7.3 in a perturbative scheme developed by Bogoliubov
(see, in particular, sec. 7.3.3.2) where it will be seen that the interaction
among the constituents of the gas causes a shift of the ground state energy
of the gas and also modifies in an important way the excited states,
resulting in a notable change in the thermodynamic properties of the gas.
The ground state no longer corresponds to all the particles occupying the
single-particle ground state (one with momentum $p = 0$), but the mean
number of particles in the $p = 0$ state is close to $N$, i.e., the ‘condensate
fraction’ (see chapter 7, sec. 7.4.4) is close to unity. Notably, however, the
thermodynamics of the system gets modified in a non-trivial manner, as
seen from the fact that the compressibility below the BE transition temperature
is finite, in contrast to that of an ideal gas, which is infinitely
large (refer to chapter 7, fig. 7-1). What is more, the excitation spectrum
of the system gets modified. While the excitation spectrum for an ideal
gas is described in terms of the distribution of the particles of the gas
among the single-particle momentum states (energy $\epsilon(p) = \frac{p^2}{2m}$), that in
the case of the weakly interacting gas is described in terms of the distribution
of elementary excitations where, at low momenta, the energy $\epsilon(p)$ of an
elementary excitation is given by a linear rather than a quadratic dependence
on the magnitude of the momentum. In consequence, there arises the possibility
of superfluidity of the gas, when flowing through a capillary tube with a
velocity less than the velocity of sound in it (superfluidity is not possible
in an ideal gas even below the BE transition temperature). In summary,
the Bogoliubov theory for a weakly interacting Bose gas tells us that the
features of the BE condensation for such a gas resemble those for an
ideal gas while, at the same time, being of a distinctive nature.


In the case of a strongly interacting system such as liquid He-4, the theoretical
description of BE condensation is less fully developed (the observed
transition temperature for liquid He-4 is, however, not too different
from the transition temperature for an ideal gas made up of particles
with the same mass as the Helium-4 atom). In particular, it is not easy to
apply the general criterion for BE condensation (see chapter 7, sec. 7.4.4)
to specific cases such as liquid helium. As a result, the relation between
BE condensation and superfluidity of liquid helium needs further clarification,
since the application of the general criterion (there are several
equivalent forms) for BE condensation to liquid helium poses a difficult
problem and does not yield unambiguous results.


3.4 Statistical mechanics: simple applications 


We will now have a look at a few simple but useful applications of the 
principles of equilibrium statistical mechanics outlined so far, where we 
make use of the equilibrium ensembles introduced in chapter 2, and the 
results stated in sec. 3.1 (systems described by sums of independent 
Hamiltonians) along with those in sections 3.2 and 3.3 (the ideal gas in 
the classical and quantum theories). The simple but basic principles 
expounded in these foregoing sections will be seen to be relevant in a 


number of areas of practical interest. 


At the end of this chapter we digress to look at an ‘application’ of a different
kind: a demonstration of how the canonical distribution can be justified
on the basis of the microcanonical ensemble.


3.4.1 Specific heats of crystalline solids 


The classical theory of specific heats of solids starts with the Dulong-Petit
law: the molar specific heat at constant volume of a crystalline solid
equals $3R$, where $R$ stands for the universal gas constant ($= Ak_B$, where
$A$ denotes the Avogadro number). This is explained by assuming that
each atom making up the crystal (we assume that the constituents do
not have an internal structure that may contribute to the specific heat)
is effectively a three-dimensional simple harmonic oscillator,
equivalent to three independent 1D oscillators. Each of these three independent
1D oscillators contributes an amount $k_BT$ to the mean energy
of the crystal (resulting from two independent quadratic terms in the
Hamiltonian; refer to (2-70), (2-73)). The total internal energy made up of
contributions from the $3N$ number of 1D oscillators describing the vibrations
of $N$ number of atoms constituting the crystal is then $3Nk_BT$, which
explains the value $3R$ of the molar specific heat.


This is an instance of the additivity of the mean energy of a system made 
up of independent constituents, all having the same partition function z 
(see (3-4b) and note following this formula); note that we are now referring 
to the classical context. The atoms in a crystalline solid are distinguished 
by their locations in the crystal lattice. In contrast, the atoms making up a 
gas are not so distinguished, and one needs the formula (3-5) to explain its 


properties in the classical theory. 


The Dulong-Petit law appears to provide a correct explanation of the
specific heat of crystalline solids at relatively high temperatures, giving a
constant value independent of temperature. At low temperatures, however,
the specific heats of solids are found to decrease with decreasing temperature,
tending towards the value $C_V \to 0$ as $T \to 0$.

For a correct explanation of the decrease of the specific heat of a crystalline
solid with decreasing temperature, one needs to resort to the quantum
theory, taking into account the discrete character of the possible
energy values of an oscillator.

3.4.1.1 Specific heats of crystalline solids: Einstein’s theory


A first step in developing a correct theory of crystalline specific heat is 
to assume that the N number of atoms making up the crystal can be 
looked upon as an assembly of 3N number of independent 1D harmonic 


oscillators, all of the same frequency, say w. 


The canonical partition function of a harmonic oscillator of mass $m$ and
frequency $\omega$, looked at as a quantum mechanical system, is given by (2-28).
For an assembly of $3N$ independent oscillators, all of the same frequency
and mass, the partition function is obtained as (refer to the derivation of
formula (3-5))

$$Z = \Big(\frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}}\Big)^{3N}. \tag{3-95}$$


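One can now obtain the mean energy (i.e., the internal energy in the
thermodynamic limit) as $U = -\frac{\partial\ln Z}{\partial\beta}$, from which the specific heat works
out to

$$C_V = \Big(\frac{\partial U}{\partial T}\Big)_V = 3Nk_B\Big(\frac{\hbar\omega}{k_BT}\Big)^2\frac{e^{-\hbar\omega/k_BT}}{\big(1 - e^{-\hbar\omega/k_BT}\big)^2} \tag{3-96}$$

(check this out; the molar specific heat is obtained with $N = A$). This
expression goes over to the Dulong-Petit value at $T \to \infty$, as it should (the
classical limit). At $T \to 0$ it does go to zero, but exponentially rapidly
($C_V \approx 3Nk_B\big(\frac{\hbar\omega}{k_BT}\big)^2e^{-\hbar\omega/k_BT}$), which is contrary to what is observed experimentally.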
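The two limits of the Einstein specific heat are easily checked numerically. The following is a minimal sketch, assuming the dimensionless variable $x = \frac{\hbar\omega}{k_BT}$ and measuring the specific heat in units of $3Nk_B$; the function name is hypothetical.

```python
import math

def einstein_cv(x):
    """C_V / (3 N k_B) for the Einstein model, with x = hbar*omega / (k_B*T) > 0."""
    return x * x * math.exp(-x) / math.expm1(-x) ** 2

for t in (0.05, 0.2, 1.0, 10.0):        # t = k_B*T / (hbar*omega)
    print(t, einstein_cv(1.0 / t))
```

For $t \gg 1$ the printed value approaches 1 (the Dulong-Petit limit), while for $t \ll 1$ it falls off like $x^2e^{-x}$, i.e., faster than any power of $T$.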


Eq. (3-96) is referred to as Einstein’s formula for the specific heat of a
crystalline solid.

3.4.1.2 The Debye theory

It is not quite the right thing to identify the harmonic oscillators with the
individual atoms in a crystal. The atoms are bound to one another by
spring-like forces that produce collective oscillations in the crystal lattice.
Assuming the forces to be linear in the relative displacements, one can
set up the equations of motion of the entire ensemble of atoms, which can
be transformed to a simple form in terms of a set of what are referred to
as normal mode co-ordinates, there being a total of $3N$ number of these.
The normal mode co-ordinates typically are linear combinations of the
atomic co-ordinates, i.e., are collective ones. The equation of motion of a
normal mode co-ordinate is of the same form as that of a simple harmonic
oscillator with some frequency $\omega$ which, however, varies from one normal
mode to another. It is this spectrum of frequencies that is ignored in
arriving at the Einstein formula (3-96).


The normal modes are in the nature of waves set up in the crystal, with
each of these being characterized by a wave vector $\mathbf{k}$ and a frequency $\omega$
where the latter depends on $\mathbf{k}$ along with a discrete index, say, $s$. The
components of the wave vector vary over a range depending on the crystal
structure that determines a certain region in the $\mathbf{k}$-space referred to
as the Brillouin zone. The discrete index $s$ also comes in due to the crystal
structure, there being $3p$ number of indices, where $p$ stands for the number
of atoms constituting the basis in the primitive unit cell of the Bravais
lattice underlying the crystal structure. Each of the $3p$ number of indices
corresponds to a certain functional dependence of $\omega$ on the wave vector
$\mathbf{k}$ that can be represented graphically by a $\omega$-$\mathbf{k}$ dispersion relation within
the Brillouin zone. There are a total of $3p$ such branches of dispersion relation
of which 3 are referred to as acoustic branches since the frequency
$\omega$ goes to zero as $k \to 0$ for each of these branches. The remaining $3(p - 1)$
of the branches correspond to optical modes, for each of which $\omega$ remains
finite as $k \to 0$. At low temperatures only the acoustic modes are excited
since the optical modes correspond to excitations higher up in
the energy scale.


The contribution of lattice vibrations to the specific heat at sufficiently
low temperatures is thus explained to a good degree of approximation in
terms of the acoustic modes alone (the optical modes being ‘frozen in’
in their ground states), where a normal mode with a given value of $\mathbf{k}$
(and of $s$) can be looked upon as a quantum mechanical oscillator with
frequency $\omega_{\mathbf{k}s}$, whose mean energy at temperature $T$ can be obtained from
the partition function of the form (2-28).


At sufficiently low temperatures, each of the three acoustic modes can
be looked upon as a standing wave in a continuous medium within the
volume ($V$) of the crystal. Keeping in mind the additivity principle of the
internal energy $U = -\frac{\partial\ln Z}{\partial\beta}$ (see note following (2-28)), one has to sum over
the logarithm of the partition function of all the possible acoustic modes
with a specified value of $\omega_{\mathbf{k}s}$ within some small range and finally sum over
all these small frequency ranges. Assuming for the sake of simplicity that
the $\omega$-$\mathbf{k}$ relation is isotropic in the $\mathbf{k}$-space (which means that the crystal
behaves like an isotropic medium in which the waves are set up), the
number of modes within a frequency range $\omega$ to $\omega + d\omega$ for any of the three
acoustic branches appears in the form [85],

$$g(\omega)\,d\omega = \frac{V\omega^2}{2\pi^2c^3}\,d\omega, \tag{3-97}$$


where $c = \lim_{k \to 0}\frac{\omega}{k}$ represents the speed of the wave representing a normal
mode. In other words, the dispersion relation characterizing the
waves of lattice vibration is, to a good degree of approximation, of the
form

$$\omega = c|\mathbf{k}|. \tag{3-98}$$

Of the three acoustic branches, two correspond to transverse and one to
longitudinal waves characterized by speeds, say, $c_t$ and $c_l$.


1. The waves under consideration depend to some extent on the boundary 
conditions that these are required to satisfy. In the case of periodic 
boundary conditions these are in the nature of traveling waves with a 
speed c while in the case of fixed boundary conditions, these appear 
as standing waves generated by superpositions of oppositely directed 
traveling waves, each of speed c. The density of modes g(w) turns out to 
be independent of the boundary conditions in the thermodynamic limit 
V — oo, as it should, since it determines the macroscopic properties of 


the crystal. 


2. The identification of acoustic modes as ‘transverse’ and ‘longitudinal’
ones with speeds $c_t$, $c_l$ is valid only for simple crystal structures for
which the waves can be assumed to be set up in an isotropic medium.

With all these considerations in place, the logarithm of the canonical partition
function for all the vibrational modes taken together is obtained
as (refer to (2-28))

$$\ln Z = \frac{V}{2\pi^2}\Big(\frac{1}{c_l^3} + \frac{2}{c_t^3}\Big)\int d\omega\,\omega^2\ln\frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}}. \tag{3-99a}$$


In this expression, the numerator in the logarithm on the right hand side
gives a contribution independent of the temperature to the mean energy
($U = -\frac{\partial\ln Z}{\partial\beta}$), representing the ground state energy (say, $U_g$) of the crystal, in which
all the normal modes are in their ground states (check this out). One can
thus write

$$\ln Z = -\beta U_g + \frac{V}{2\pi^2}\Big(\frac{1}{c_l^3} + \frac{2}{c_t^3}\Big)\int_0^\infty d\omega\,\omega^2\ln\frac{1}{1 - e^{-\beta\hbar\omega}}. \tag{3-99b}$$


Here, the lower limit of integration on the right hand side is zero, corresponding
to the fact that the frequency of each of the acoustic modes goes
to zero for the wave vector $\mathbf{k}$ going to zero. The upper limit, on the other
hand, has been taken to be infinity by way of adopting a simplification:
since the integrand becomes exponentially small for $\hbar\omega \gg \beta^{-1}$, one can,
for sufficiently low $T$, integrate over the entire $\mathbf{k}$-space (i.e., up to $\omega \to \infty$)
instead of integrating only over the Brillouin zone. The internal energy is
then obtained as

$$U = U_g + \frac{3V}{2\pi^2\hbar^3c_s^3\beta^4}\int_0^\infty \frac{x^3\,dx}{e^x - 1} \qquad \Big[c_s^{-3} = \frac{1}{3}\big(2c_t^{-3} + c_l^{-3}\big)\Big] \tag{3-99c}$$

(check this out) where $c_s$ denotes an average speed of the acoustic waves
in the crystal, assumed to behave like an isotropic continuous medium.

This result could be obtained more directly by making use of the mean
energy $\frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}$ of each mode (measured from its ground state energy) and
then integrating over all possible modes; try it out.
Inserting the known value of the integral on the right hand side ($= \frac{\pi^4}{15}$)
and taking the derivative with respect to temperature, we finally arrive at
the following expression for the vibrational specific heat of the crystalline
solid,

$$C_V = \frac{2\pi^2Vk_B^4T^3}{5\hbar^3c_s^3}. \tag{3-100}$$

This gives a $T^3$-dependence of the specific heat at low temperatures, in
contrast to what is implied by the Einstein formula (3-96). The ‘$T^3$ law’
has been verified experimentally down to the lowest attainable temperatures.


While the Dulong-Petit law of specific heats of monatomic solids holds in
the high temperature regime and the $T^3$ law holds at very low temperatures,
Debye worked out a formula that interpolates to moderately low
temperatures. This formula is obtained by observing that the total number
of acoustic modes is given by the density of modes integrated over $\omega$
and summed over the three acoustic branches, i.e., by
$\frac{3V}{2\pi^2\bar{c}^3}\int \omega^2\,d\omega$ where
the lower limit of integration is 0, but the upper limit cannot be taken to
be $\infty$, contrary to what was done in (3-99b), since in the present case the
integral diverges at infinity. The total number of acoustic modes has to
be $3N$, and a divergence in the integral is indicative of the fact that there
has to be an upper cut-off frequency to account for the finiteness of the
number of modes. We introduce the cut-off frequency $\omega_D$, referred to as
the Debye frequency, assuming it to be the same for all the three acoustic
branches for the sake of simplicity. One thereby obtains


$$\frac{V\omega_D^3}{2\pi^2\bar{c}^3} = 3N. \tag{3-101}$$


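Knowing $\bar{c}$ (from values of $c_t, c_l$), one can obtain $\omega_D$ from this formula,
which then gives the Debye formula


$$C_V = 9Nk_B\left(\frac{T}{\Theta_D}\right)^3 G\!\left(\frac{\Theta_D}{T}\right), \tag{3-102a}$$

where

$$\Theta_D = \frac{\hbar\omega_D}{k_B}, \qquad G(u) = \int_0^u \frac{x^4e^x}{(e^x-1)^2}\,dx. \tag{3-102b}$$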


At vanishingly small temperatures, $\Theta_D/T$ goes to $\infty$, and one recovers the
Debye $T^3$-law (3-100).
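The two limits of the Debye interpolation can be illustrated numerically. The sketch below assumes the standard form of the Debye specific heat, $C_V/(Nk_B) = 9\,(T/\Theta_D)^3\int_0^{\Theta_D/T} x^4e^x/(e^x-1)^2\,dx$, with $\Theta_D = \hbar\omega_D/k_B$; the function name and the integration settings are illustrative. At high temperatures the result should approach the Dulong-Petit value 3, and at low temperatures the $T^3$ law with coefficient $12\pi^4/5$.

```python
import math

def debye_cv_per_nk(t_over_theta, n_steps=4000):
    """C_V / (N k_B) in the Debye model, as a function of T / Theta_D.

    Uses the midpoint rule on the integral of x^4 e^x / (e^x - 1)^2,
    written as x^4 e^{-x} / (1 - e^{-x})^2 to avoid overflow at large x.
    """
    u = 1.0 / t_over_theta          # upper limit Theta_D / T
    h = u / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        total += x**4 * math.exp(-x) / (-math.expm1(-x))**2
    return 9.0 * t_over_theta**3 * total * h

cv_high = debye_cv_per_nk(10.0)     # T = 10 Theta_D
cv_low = debye_cv_per_nk(0.02)      # T = Theta_D / 50
t3_law = 12.0 * math.pi**4 / 5.0 * 0.02**3

print(f"high T: C_V / (N k_B) = {cv_high:.5f}  (Dulong-Petit value: 3)")
print(f"low  T: C_V / (N k_B) = {cv_low:.6e}  (T^3 law: {t3_law:.6e})")
```

Evaluating the same function at intermediate values of $T/\Theta_D$ traces out the smooth crossover between the two regimes.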


3.4.1.3. The ‘phonon gas’ 


In sec. 3.4.1.2 above, the statistical mechanics of the vibrating lattice is
described by looking at it as a system made up of the normal modes as
constituents, where the Hamiltonian is expressed as the sum of so many
harmonic oscillator Hamiltonians corresponding to the normal modes.


We now turn to an alternative and equivalent description in terms of the
‘phonon gas’.


The possible quantum states corresponding to any specific normal mode
(with wave vector $\mathbf{k}$ and branch index $s$) are just the states of an
oscillator (whose frequency we denote by, say, $\omega_{\mathbf{k}s}$). If $n_{\mathbf{k}s}$ denotes the
quantum number corresponding to a typical state of this oscillator, then the
contribution of the mode under consideration to the energy of the vibrating
lattice is


$$E(n_{\mathbf{k}s}) = \hbar\omega_{\mathbf{k}s}\left(n_{\mathbf{k}s} + \frac{1}{2}\right). \tag{3-103}$$


This formula leads to an alternative and convenient description of the 
vibrating lattice in which the lattice energy is looked upon as the energy 


of so many hypothetical ‘particles’ called phonons. 


We will, for the sake of simplicity, disregard the zero point energies of
the modes. More precisely, one notes that whatever be the vibrational
state of the lattice, the total zero point energy $\sum_{\mathbf{k},s}\frac{1}{2}\hbar\omega_{\mathbf{k}s}$ is always there
and can be taken as the zero on a shifted energy scale so that, in this
scale, the contribution of the $\mathbf{k}s$ mode to the energy is just $n_{\mathbf{k}s}\hbar\omega_{\mathbf{k}s}$. One
then expresses the same thing in the alternative language by saying that
it is the energy of $n_{\mathbf{k}s}$ phonons, each of energy $\hbar\omega_{\mathbf{k}s}$, associated
with the mode $\mathbf{k}s$.


In order that this language makes sense, it should be possible to asso- 
ciate with a phonon some definite momentum as well, and it turns out 


that this precisely is the case. 
Thus, a phonon with energy $\hbar\omega_{\mathbf{k}s}$ turns out to be associated with a
momentum $\hbar\mathbf{k}$, and this amount of momentum appears in or disappears from
the vibrating crystal when such a phonon is created or destroyed through
some interactions, conserving the total momentum of the crystal plus the
external system interacting with it.


Remember that all this could also be expressed in the language we started
with, namely the description in terms of modes of vibration; for instance,
the ‘creation’ of a phonon with energy $\hbar\omega_{\mathbf{k}s}$ is nothing but a change in the
state of the lattice wherein the oscillator quantum number of the $\mathbf{k}s$ mode
increases by unity.


In this new language of description, then, the energy of the crystal lattice 
is just the sum of energies of phonons, each phonon being in the ‘state’ 
corresponding to some mode ks. The phonons are thus non-interacting 
particles and the statistical mechanics of the vibrating lattice is just the 


statistical mechanics of an ‘ideal gas’ of phonons. 


In reality, though, there is a bit of simplification involved here since the 
Hamiltonian of the system can be expressed as a sum of harmonic con- 
tributions from the normal modes only approximately. A more accurate 
description requires additional terms to be taken into account in the Hamil- 
tonian which are collectively called the anharmonic terms. In the descrip- 
tion involving phonons these additional terms imply an interaction among 
the phonons. However, these interactions happen to be almost non-existent 
for a wide range of situations of practical interest, especially at low temper- 


atures. 
In summary, we have two equivalent descriptions of the vibrating crystal
lattice: the oscillator description or description in terms of normal
modes, and the phonon description; in the latter language, the mode that
a phonon is associated with may be thought of as specifying the ‘state’ of
the phonon.


In the phonon language the crystal wavefunction is completely specified
once one knows how many phonons are associated with each normal
mode. This is equivalent to specifying $n_{\mathbf{k}s}$, the quantum number of the
‘state’ $\mathbf{k}s$ for all $\mathbf{k}$ and $s$. Since the quantum number of an oscillator can be
any arbitrary non-negative integer, there is no restriction whatever on the
number of phonons that can occupy any given state. In addition, there is
no distinguishing label associated with the phonons since only the number
of phonons in any given state is relevant in describing the state of the
crystal as a whole. In other words, the phonons constitute an ‘ideal gas’
of bosons. This is how the concept of an ideal gas, seemingly an idealized
one, assumes relevance for a system with interacting constituents.
While the atoms in the crystal all interact with one another, their states
of oscillation can be described, in the harmonic approximation, in terms
of non-interacting phonons.


One other fact of importance is that the BE gas of phonons is characterized
by a chemical potential $\mu = 0$, since the number of phonons is not
subject to any conservation principle. Going back to the case of an ideal
gas of non-interacting atoms, the number of atoms in an isolated sample
of the gas is conserved and has a well defined value; the number becomes
variable only when the gas interacts with an external system (such
as a ‘particle reservoir’) and exchanges the atoms with the latter. Supposing
the gas to be coupled to a heat reservoir but not to a particle reservoir,
there occurs a fluctuation in its energy but not in its particle number,
since the particle number is an independently conserved quantity. In the
case of a phonon gas, on the other hand, a heat reservoir is automatically
a ‘phonon reservoir’ too, since a fluctuation in energy is attended
by a fluctuation in the phonon number as well, caused by the creation
and annihilation of phonons in the various modes. This is expressed by
saying that the phonons are ‘quasi-particles’ rather than particles as in
the case of atoms in a gas: the quasi-particles describe states of an
underlying system, can be created and annihilated (even with an arbitrarily
small energy change), and have attributes like particles (such as
momentum, with rest mass zero).
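The equivalence of the oscillator and phonon descriptions can be checked for a single mode. In the sketch below (a minimal illustration; the function names and the truncation `n_max` are assumptions made here), the thermal average of the oscillator quantum number is computed directly from the canonical weights $e^{-n x}$, with $x = \hbar\omega/(k_BT)$ and the zero point energy dropped, and is compared with the Bose-Einstein occupation number evaluated at zero chemical potential.

```python
import math

def mean_quantum_number(x, n_max=5000):
    """Canonical average of the oscillator quantum number n.

    x = hbar * omega / (k_B * T); the level n has Boltzmann weight exp(-n x).
    The sum is truncated at n_max, which is harmless once exp(-n_max x) is tiny.
    """
    weights = [math.exp(-n * x) for n in range(n_max)]
    z = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / z

def bose_einstein_occupation(x, mu_over_kt=0.0):
    """Mean occupation of a single-particle state for bosons; mu = 0 for phonons."""
    return 1.0 / math.expm1(x - mu_over_kt)

for x in (0.1, 1.0, 3.0):
    print(f"x = {x:3.1f}: oscillator average = {mean_quantum_number(x):.10f}, "
          f"BE with mu = 0 = {bose_einstein_occupation(x):.10f}")
```

The agreement holds only for vanishing chemical potential, which is the numerical counterpart of the statement that the phonon number is not conserved.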


Based on these considerations, one can now refer to (3-52b) to obtain
the contribution of the vibrational modes to the internal energy of a
crystalline solid, looked at as a phonon gas at any given temperature $T$, where
the single-particle states (corresponding to the index $a$ in this expression)
in the phonon picture are represented by the modes described by the
parameters $\mathbf{k}, s$. Making use of the continuum approximation in the limit
of large volume ($V$), and of the density of modes $g(\omega)$ (refer to (3-97)) we
obtain


$$U = \sum_{s=1}^{3}\frac{V}{2\pi^2c_s^3}\int_0^\infty \frac{\hbar\omega^3}{e^{\beta\hbar\omega}-1}\,d\omega, \tag{3-104a}$$


where the summation is over the three acoustic modes that one needs
to consider in the low temperature limit, the speed corresponding to any
one of these modes being denoted by $c_s$ ($s = 1,2,3$). Recalling that, under
the assumption of the solid behaving like a continuous isotropic medium,
$c_s = c_t$ for two of the branches and $c_s = c_l$ for the remaining branch, one
obtains, as in sec. 3.4.1.2,


$$U = \frac{3V}{2\pi^2\bar{c}^3}\int_0^\infty \frac{\hbar\omega^3}{e^{\beta\hbar\omega}-1}\,d\omega, \tag{3-104b}$$


where $\bar{c}$ is the average speed, as indicated in (3-99c). On substituting
$\beta\hbar\omega = x$, one obtains the same expression for $U$ as in (3-99c), with
the ground-state energy ($U_0$) left out since it does not contribute to the
specific heat. Taking the derivative with respect to $T$, one recovers the
Debye $T^3$-law (3-100) at low temperatures.


Equations (3-99c) and (3-104b), like other formulas for the thermodynamic 
functions, are to be interpreted as leading approximations for large V; strictly 
speaking, one needs corrections depending on such factors as the shape 
and size of the sample of crystalline material — the ‘non-thermodynamic 
variables’ mentioned in sec. 2.1.3. Throughout this book we ignore such 
non-leading terms, with the understanding that only the leading terms re- 


main relevant in the thermodynamic limit. 


The specific heat constitutes an important instance of a static response
function of a thermodynamic system. Generally speaking, thermodynamic
systems in real life are complex ones, and a proper understanding
of their static and dynamic response to perturbations involves a diverse
multitude of considerations relating to contributions from numerous factors
characterizing any particular system. For instance, the specific heat
relates the change in temperature (the ‘response’) with the change
in internal energy (the ‘cause’; under the condition of a constant volume
of a fluid, this equals the heat added to it). As we have seen in the present
section, the specific heat of a crystalline solid at low temperatures is
accounted for by the energy associated with lattice vibrations. Additionally,
the specific heat may receive contributions from other ‘modes’ of energy
sharing, an instance of which will be found in sec. 3.4.3, where we will
look at the ‘free electron gas’ in crystalline conductors.


3.4.2 Black-body radiation: the Planck formula 


One of the most celebrated coups in physics was Planck’s explanation of 


the energy distribution in black-body radiation. 


The term black-body radiation denotes electromagnetic radiation in
equilibrium at some given temperature $T$. The equilibrium is effected by
confining the radiation in a material enclosure maintained at the given
temperature, the enclosed volume ($V$) being sufficiently large.


This was initially thought to be a completely classical system, the
thermodynamics of black-body radiation having been well-known. However, the
energy distribution in the black-body radiation remained unexplained till
Planck wrote down his formula that we are now going to have a look at.
The derivation of the formula makes use of the quantum nature of black-body
radiation and runs parallel to that of the Debye $T^3$-law worked out
in sec. 3.4.1.


The electric and magnetic field components in the black-body radiation
satisfy the wave equation inside the enclosure, together with a set of
boundary conditions at the enclosing surface. As a result, the electromagnetic
field components can be written down as a linear superposition
of a set of basic solutions, the latter being in the nature of standing waves
in the enclosure. These standing waves are the normal modes of the system
since each is associated with some specific frequency. When the
volume of the enclosure is large enough, one expects that the exact form
of the boundary conditions would not matter much so far as bulk properties
of the system are concerned. Thus, we can choose the boundary
conditions, including the shape of the enclosure, according to our
convenience and still arrive at meaningful results.


The field components behave like standing waves in the case of fixed bound- 
ary conditions. Under periodic boundary conditions, on the other hand, the 
field components behave like progressive waves. The standing waves are 


produced by superposition of oppositely directed progressive waves. 


A convenient choice is a cubical enclosure of side, say $L$, together with
the boundary condition that the field component under consideration
vanishes at the boundary of the cube (the ‘fixed’ boundary conditions).
Choosing the coordinate axes parallel to the edges of the cube, each normal
mode is then characterised by some specific wave vector $\mathbf{k}$ with
components $\frac{\pi}{L}(n_1,n_2,n_3)$, with the mode numbers $n_1,n_2,n_3$ being positive
integers. The relation between the frequency $\omega$ and wave vector $\mathbf{k}$ is the
familiar

$$\omega = ck, \tag{3-105}$$


$c$ being the velocity of propagation of electromagnetic radiation. Finally,
the radiation field is characterised by the property that there correspond
two independent modes for any given wave vector $\mathbf{k}$ namely, say, the
left-handed and right-handed circularly polarised waves.


This last fact gives rise to a factor of 2 in the expression for the density of
modes $g(\omega)$, which would otherwise be exactly the same as that in (3-97),
since the latter was derived on precisely the same premises, and with the
dispersion relation (3-98) (with $c$ now standing for the velocity of
electromagnetic radiation). In other words, the density of modes in black-body
radiation is


$$g(\omega) = \frac{V\omega^2}{\pi^2c^3}. \tag{3-106}$$


Note that, as expected, the spatial dimensions of the enclosure enter
into the expression for the density of modes only through the volume of
the enclosure and not, for instance, through its surface area.
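The continuum count of modes can be compared with a direct enumeration. The sketch below is an illustration under stated assumptions: fixed boundary conditions in a cube of side $L$, wave vectors $\frac{\pi}{L}(n_1,n_2,n_3)$ with positive integers $n_i$, two polarisations per wave vector, and a dimensionless cut-off $R = \omega L/(\pi c)$ (a name introduced here). Integrating a density of modes $V\omega^2/(\pi^2c^3)$ from 0 to $\omega$ gives $V\omega^3/(3\pi^2c^3) = \pi R^3/3$, and the enumeration should approach this as $R$ grows.

```python
import math

def count_modes(r_cut):
    """Number of modes with n1^2 + n2^2 + n3^2 <= r_cut^2 (n_i positive integers),
    counting two polarisations for every wave vector."""
    count = 0
    n_top = int(r_cut)
    for n1 in range(1, n_top + 1):
        for n2 in range(1, n_top + 1):
            rest = r_cut * r_cut - n1 * n1 - n2 * n2
            if rest >= 1.0:
                # number of positive integers n3 with n3^2 <= rest
                count += math.isqrt(int(rest))
    return 2 * count

def continuum_modes(r_cut):
    """Integrated density of modes, V omega^3 / (3 pi^2 c^3), written in terms of R."""
    return math.pi * r_cut**3 / 3.0

for r_cut in (50, 100, 200):
    ratio = count_modes(r_cut) / continuum_modes(r_cut)
    print(f"R = {r_cut:3d}: exact count / continuum estimate = {ratio:.4f}")
```

The shortfall relative to the continuum estimate shrinks roughly like $1/R$; it is a surface-type correction that becomes irrelevant for a large enclosure, in line with the remark that only the volume enters the leading term.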


Analogous to the vibrating lattice, each normal mode appears as a harmonic
oscillator in the quantum description of the black-body radiation,
for which the Hamiltonian assumes the form of a sum over the harmonic
oscillator Hamiltonians. In other words, the normal modes can be looked
upon as independent and distinct constituents, all in thermal equilibrium
at temperature $T$, and hence the assembly of normal modes can be
represented by a canonical ensemble.


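The mean energy of a typical normal mode of frequency $\omega$ is obtained
from (2-28) by differentiation ($\langle E\rangle = -\frac{\partial \ln Z}{\partial\beta}$) as


$$\langle E\rangle = \frac{\hbar\omega}{e^{\beta\hbar\omega}-1}, \tag{3-107}$$


where the ground state energy ($\frac{1}{2}\hbar\omega$) of the oscillator has been left out as
being devoid of thermodynamic significance.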


On multiplying this expression with $g(\omega)$ given by (3-106) and dividing by
$V$, we obtain the mean energy per unit volume of the black-body radiation
in the frequency interval from $\omega$ to $\omega + d\omega$. Denoting this by $u_\omega d\omega$, we obtain


$$u_\omega = \frac{1}{\pi^2c^3}\frac{\hbar\omega^3}{e^{\beta\hbar\omega}-1}, \tag{3-108}$$

which is Planck’s formula for black-body radiation.
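A short numerical illustration of the Planck distribution may be useful. The sketch below assumes the form $u_\omega = \hbar\omega^3/[\pi^2c^3(e^{\beta\hbar\omega}-1)]$ in SI units (the constants are the exact SI or CODATA values; the function names are chosen here for illustration). It evaluates $u_\omega$ and locates the maximum of the distribution by solving $x = 3(1-e^{-x})$ for $x = \hbar\omega/(k_BT)$, which is the condition for $x^3/(e^x-1)$ to be stationary.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J / K
C = 2.99792458e8         # speed of light, m / s

def planck_u(omega, temperature):
    """Energy per unit volume per unit angular frequency at (omega, T)."""
    x = HBAR * omega / (K_B * temperature)
    if x > 700.0:            # e^x would overflow; the density is negligible here
        return 0.0
    return HBAR * omega**3 / (math.pi**2 * C**3 * math.expm1(x))

def wien_peak_x(n_iter=60):
    """Fixed-point iteration for x = 3 (1 - e^{-x}); converges since 3 e^{-x} < 1 near the root."""
    x = 3.0
    for _ in range(n_iter):
        x = 3.0 * (1.0 - math.exp(-x))
    return x

x_peak = wien_peak_x()
temperature = 300.0
omega_peak = x_peak * K_B * temperature / HBAR

print(f"peak at hbar * omega / (k_B T) = {x_peak:.6f}")
print(f"at T = {temperature} K the peak lies at omega = {omega_peak:.4e} rad/s")
```

Since the peak sits at a fixed value of $\hbar\omega/(k_BT)$, the frequency of maximum energy density scales linearly with temperature.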


The internal energy of the black-body radiation is obtained on multiplying
(3-108) back with $V$ and then integrating over $\omega$ from 0 to $\infty$. This
gives, finally, the specific heat at constant volume of the system upon
differentiation by the temperature $T$,


$$C_V = \frac{4\pi^2k_B^4V}{15\hbar^3c^3}T^3, \tag{3-109}$$


which shows a $T^3$-dependence once again. Evidently, the statistical
mechanics and thermodynamics of the vibrational modes in a crystalline
solid and those relating to the electromagnetic modes making up black-body
radiation are closely analogous.
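The step from the Planck distribution to the specific heat can be verified numerically. The following sketch (an illustration only; function names, the integration settings and the finite-difference step are assumptions made here) integrates the Planck energy density over all frequencies, differentiates the result with respect to $T$ by a central difference, and compares it with the closed form $4\pi^2k_B^4T^3/(15\hbar^3c^3)$ per unit volume.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J / K
C = 2.99792458e8         # speed of light, m / s

def bose_integral(n_steps=20000, x_max=60.0):
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) over (0, x_max)."""
    h = x_max / n_steps
    return h * sum(((i + 0.5) * h)**3 / math.expm1((i + 0.5) * h)
                   for i in range(n_steps))

def energy_density(temperature):
    """U / V from integrating the Planck formula, using x = hbar omega / (k_B T)."""
    prefactor = (K_B * temperature)**4 / (math.pi**2 * HBAR**3 * C**3)
    return prefactor * bose_integral()

def cv_per_volume_formula(temperature):
    """Closed form 4 pi^2 k_B^4 T^3 / (15 hbar^3 c^3)."""
    return 4.0 * math.pi**2 * K_B**4 * temperature**3 / (15.0 * HBAR**3 * C**3)

def cv_per_volume_numeric(temperature, dt=0.01):
    """Central-difference derivative of U / V with respect to T."""
    return (energy_density(temperature + dt) - energy_density(temperature - dt)) / (2.0 * dt)

T = 300.0
print(f"U / V at {T} K       = {energy_density(T):.6e} J/m^3")
print(f"C_V / V (numeric)     = {cv_per_volume_numeric(T):.6e} J/(m^3 K)")
print(f"C_V / V (closed form) = {cv_per_volume_formula(T):.6e} J/(m^3 K)")
```

Because $U \propto T^4$, the ratio $C_V T/U$ equals 4 at every temperature, which offers a quick consistency check on any numerical evaluation.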


The analogy between the vibrating lattice and black-body radiation can
be extended further to the point of employing an alternative language in
the description of black-body radiation. In this language, instead of the
normal modes of electromagnetic radiation in an enclosure, one talks of
an assembly of non-interacting photons. Thus, if the oscillator quantum
number associated with a normal mode with frequency $\omega$ and wave vector
$\mathbf{k}$ be $r$, then we can say that there are $r$ photons in a ‘state’
$\mathbf{k},s$ with energy $\hbar\omega_{\mathbf{k}}$, where $s = 1,2$ specifies the polarization. A photon
corresponding to the wave vector $\mathbf{k}$ possesses a momentum $\hbar\mathbf{k}$. All this
is, of course, entirely analogous to the description of lattice vibrations in
terms of phonons.

The dispersion relation between the energy and the wave vector of 
a photon is given by (3-105), but now it is an exact relation for all fre- 
quencies and wave vectors, without its validity being limited to just the 
low-frequency regime or to some specific branch as in the case of the 


phonons. 


The de Broglie relations between energy and frequency on the one hand
and between momentum and wave vector on the other, tell us that (3-105)
corresponds just to the relativistic energy-momentum relation for a particle
with rest mass zero. The same cannot, however, be said of phonons for
which (3-98) is valid only in the low-frequency limit.


There are other, more subtle, points of difference between the concepts
underlying photons and phonons. However, these are largely irrelevant
in the present context of statistical mechanics of phonons and photons.


Following the alternative description in terms of photons, the black body 
radiation may be looked upon as an assembly of non-interacting photons 
and, in close analogy to the assembly of phonons, the chemical potential 


of this assembly is zero. 


Incidentally, the statement that the photons are mutually non-interacting 
is, according to known facts, an exact one, compared to the approximate 


nature of the corresponding statement for phonons. 


The statistical mechanics of black-body radiation thus reduces to that
of an ideal gas of photons. On invoking (3-47) to obtain an expression
for the mean occupation number of photons in a ‘single-particle state’ $\mathbf{k}s$
(with $\mu = 0$) and making use of the density of states as in sec. 3.4.1.3,
one again arrives at the Planck formula (3-108) and the formula for the
specific heat of the black-body radiation (3-109) (check this out; refer
to [85], chapter 4). Other thermodynamic properties too can be obtained
likewise.


3.4.3 The ‘free electron gas’ in conductors 


Solids and liquids are condensed phases of matter, where the constituent
particles interact strongly. In particular, a solid is a tightly held material
where an enormous number of constituent particles are involved in a
complex state of mutual correlation. Nevertheless, there exist ingenious
approximation schemes, principally based on the adiabatic approximation
and the spatially periodic structure of a solid, in which the complex
problem of explaining its behavior under various types of perturbations
can be reduced to relatively simple problems where the basic principles
of statistical mechanics enunciated in earlier sections of this book can be
applied so as to arrive at meaningful results.


These principles include those relating to the equilibrium ensembles of
statistical mechanics, those relating to systems whose Hamiltonians can
be expressed as sums of independent but distinct components, and, finally,
those applying to non-interacting, but indistinguishable, components.


One such reduction concerns the electrons in a crystalline material where 
these in a certain approximate sense can be described as independent 
ones in a periodic potential, their behavior being analogous to free parti- 


cles obeying FD statistics. 


An electron in a solid experiences the field due to the nuclei (or the ionic cores
made up of the nuclei and the tightly bound electrons), and due to all the
other electrons in the material. Here we ignore the interaction energy
between the nuclei in the Hamiltonian of the system, assuming these to be
fixed in their respective positions in the crystal lattice. The interaction
between the electron and the nuclei introduces no complication in principle
since it gives rise to a single-particle Schrödinger equation independently of
the other electrons. As for the Coulomb interaction between the electrons,
it can be assumed to give rise to a mean field experienced by an electron
so that, all things considered, one can again reduce the problem effectively
to a single-particle one. The price to pay is that the mean field is to be
a self-consistent one, since it is determined by the mean electron density
which, in turn, depends on the solution to the Schrödinger equation that it
gives rise to. Assuming that a self-consistent mean field has been arrived
at and that the effects of exchange symmetry between electrons have been
properly taken into account [3], the problem ultimately reduces, in a
qualified sense, to obtaining the eigenvalues and eigenvectors of a
single-electron Schrödinger equation.


The effective potential in the one-electron Schrödinger equation shares
the periodicity of the underlying lattice, as a result of which the eigen- 
values — densely packed in the energy scale because of the macroscopic 
volume within which the electron moves — turn out to be grouped into 
energy bands, with gaps in between (some of the bands overlap, with no 
gap left between them). At zero temperature, the energy levels within 
the bands are filled up from bottom upward in accordance with the Pauli 
principle, so that a number of bands are completely filled up, with the 


last occupied band either partly or completely filled. 


In the case of metallic conductors, the last occupied band (i.e., the one
highest along the energy scale) is partly filled up. For a large class of
materials, the energy levels in this band, relative to its bottom, can be
expressed in the form


$$\epsilon_{\mathbf{k}} = \frac{\hbar^2k^2}{2m_{\rm eff}}, \tag{3-110}$$


where $\mathbf{k}$ is a wave vector and $m_{\rm eff}$ is referred to as the effective mass of the
electron.
Formula (3-110) involves a further simplification where we assume that the
dispersion relation (relating the energy with the wave vector) is an isotropic
one so that the effective mass is a scalar. Though the underlying Schrödinger
equation formally applies to a single electron, the latter is actually a
quasi-particle, i.e., an entity that incorporates the effect of all the other electrons
in the system. In other words, it is a ‘dressed’ rather than the ‘bare’ electron.
The effective mass expresses the result of this ‘dressing’.


The wave vector $\mathbf{k}$ carries the imprint of the lattice structure and has the
periodicity of the reciprocal lattice, as a result of which $\hbar\mathbf{k}$ is in the nature
of a ‘quasi-momentum’ rather than a momentum.


These few points of difference notwithstanding, the formula (3-110) is
analogous to the expression for the energy of a free particle moving within
some large volume $V$ (the volume of the conducting material), where the
components of $\mathbf{p} = \hbar\mathbf{k}$ can be assumed to be continuous variables, and
the results in sec. 3.3.6 can be applied, while keeping in mind the present
context.


Thus, at $T = 0$ the ‘free’ electrons in the conduction band fill up all the
energy levels from the bottom up, up to the Fermi level $\epsilon_F$ related to their
number density ($n$) as

$$\epsilon_F = \frac{h^2}{8m_{\rm eff}}\left(\frac{3}{\pi}\right)^{2/3}n^{2/3}, \tag{3-111a}$$


(refer to (3-75); $g = 2$ for the two spin states of the electron; here $h = 2\pi\hbar$), corresponding
to a Fermi temperature

$$T_F = \frac{\epsilon_F}{k_B} = \frac{1}{8}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_{\rm eff}k_B}\,n^{2/3}, \tag{3-111b}$$

and a Fermi momentum

$$p_F = \sqrt{2m_{\rm eff}\epsilon_F} = \frac{h}{2}\left(\frac{3n}{\pi}\right)^{1/3}. \tag{3-111c}$$

For a considerably large class of materials, the Fermi energy turns out
to be of the order of a few eV, and hence the ratio $\frac{T}{T_F} = \frac{k_BT}{\epsilon_F}$ is much smaller than unity
even at $T \sim 100\,$K, as a result of which the electrons in the conduction
band behave like a degenerate FD gas (sec. 3.3.6.2). In other words,
at ordinary temperatures, only a small fraction of the electrons close to
the Fermi level can be excited to higher energies while all the electrons
lower down in the energy scale remain unaffected when a small amount
of energy is added to the conducting material (refer to fig. 3-4).
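To get a feeling for the magnitudes involved, one can evaluate the Fermi energy, Fermi temperature and Fermi momentum for a representative electron density. The sketch below assumes the free-electron expression $\epsilon_F = \frac{h^2}{8m}\left(\frac{3n}{\pi}\right)^{2/3}$ for $g = 2$, sets the effective mass equal to the bare electron mass, and uses $n = 8.5\times10^{28}\,{\rm m}^{-3}$, roughly the conduction electron density of copper; these numerical inputs and the function names are illustrative assumptions, not values taken from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
HBAR = H / (2.0 * math.pi)
K_B = 1.380649e-23        # Boltzmann constant, J / K
M_E = 9.1093837015e-31    # electron mass, kg (used as the effective mass: an assumption)
EV = 1.602176634e-19      # joules per electron volt

def fermi_energy(n, m_eff=M_E):
    """Fermi energy for spin-1/2 particles of number density n (per m^3)."""
    return H**2 / (8.0 * m_eff) * (3.0 * n / math.pi)**(2.0 / 3.0)

def fermi_temperature(n, m_eff=M_E):
    return fermi_energy(n, m_eff) / K_B

def fermi_momentum(n):
    """Fermi momentum; note that it does not depend on the mass."""
    return 0.5 * H * (3.0 * n / math.pi)**(1.0 / 3.0)

n_assumed = 8.5e28        # assumed density, roughly that of copper
e_f = fermi_energy(n_assumed)
t_f = fermi_temperature(n_assumed)
p_f = fermi_momentum(n_assumed)

print(f"Fermi energy      = {e_f / EV:.2f} eV")
print(f"Fermi temperature = {t_f:.3e} K")
print(f"Fermi momentum    = {p_f:.3e} kg m/s")
print(f"T / T_F at 300 K  = {300.0 / t_f:.2e}")
```

With these inputs the Fermi energy comes out at a few eV and the Fermi temperature at several times $10^4$ K, so that room temperature corresponds to a strongly degenerate gas.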


Referring to formulae (3-76a), (3-77b), one finds the specific heat of the
‘free electron gas’ to be given by the expression


$$C_V = \frac{\pi^2}{2}Nk_B\frac{T}{T_F}. \tag{3-112}$$


One can now compare the contributions of the lattice vibrations and the 
‘free’ electrons to the specific heat of a crystalline conductor at ordinary 


temperatures. 


In contrast to the electron gas, the classical approximation can be employed
to describe the vibrations of $N$ atoms (or, more precisely,
the ionic cores) making up the crystal structure, yielding the result
that the mean energy of the assembly is $3Nk_BT$ (recall that the Hamiltonian
of a 3D harmonic oscillator is made up of 6 additive quadratic
terms). In comparison, the internal energy of the electron gas (a strongly
degenerate FD gas at ordinary temperatures) is given by $\frac{3}{5}Nk_BT_F$, which
is $\sim 100$ times larger. However, the specific heats of the two systems
compare in a contrary manner: the specific heat of the electron gas is smaller
by a factor $\sim 100$ since its response to an increase in internal energy
involves only those electrons which occupy an energy interval $\sim k_BT$ near
the Fermi level. In summary, the electron gas and the vibrating ions may
be looked upon, in an approximate sense, as two distinct systems obeying
completely different statistical distributions at ordinary temperatures.
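The two contrary comparisons can be made concrete with a rough estimate. The sketch below assumes one conduction electron per atom (so that the same $N$ appears for electrons and ions), the degenerate-gas expressions $\frac{3}{5}Nk_BT_F$ for the internal energy and $\frac{\pi^2}{2}Nk_BT/T_F$ for the specific heat of the electrons, the classical values $3Nk_BT$ and $3Nk_B$ for the lattice, and an illustrative Fermi temperature of $8\times10^4$ K; all of these inputs are assumptions made for the purpose of the example.

```python
import math

def energy_ratio(temperature, fermi_temperature):
    """Electron internal energy divided by classical lattice energy (same N)."""
    return (3.0 / 5.0 * fermi_temperature) / (3.0 * temperature)

def specific_heat_ratio(temperature, fermi_temperature):
    """Electron specific heat divided by the Dulong-Petit lattice value (same N)."""
    return (math.pi**2 / 2.0 * temperature / fermi_temperature) / 3.0

T_ROOM = 300.0       # K
T_FERMI = 8.0e4      # K, assumed representative value

u_ratio = energy_ratio(T_ROOM, T_FERMI)
c_ratio = specific_heat_ratio(T_ROOM, T_FERMI)

print(f"U_electron / U_lattice = {u_ratio:.1f}")
print(f"C_electron / C_lattice = {c_ratio:.4f}  (about 1 / {1.0 / c_ratio:.0f})")
```

Both ratios are controlled by the single small parameter $T/T_F$: the energy ratio scales as $T_F/T$ and the specific heat ratio as $T/T_F$, which is why the two comparisons go in opposite directions, each by a factor of order 100.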


We will not continue with further applications of the basic principles of 
equilibrium statistical mechanics to simple systems of physical interest 
or to ones whose description can be reduced to relatively simple terms by 
making appropriate approximations (for a few more illustrative examples, 


see [85]). 


In the remaining section of this chapter we look at another ‘application’ 
of the basic principles of equilibrium statistical mechanics, namely, the 
derivation of the probability distribution in the canonical ensemble from 


that in the microcanonical ensemble. 




3.4.4 Digression: the canonical ensemble from the microcanonical


In this section we consider another ‘application’ of the equilibrium
ensembles of statistical mechanics, where the canonical distribution for a
system (S) is seen to emerge within the framework of the microcanonical
ensemble for a bigger system. We assume that S interacts with a bigger
system (R, to be referred to as the ‘reservoir’), the two together comprising
an isolated composite system C in equilibrium with a specified energy
$U$ at a temperature $T$, which is the common temperature characterizing
all the three systems (S, R, C).


The thermodynamic limit will, in the end, be assumed to apply to the 
systems R,C, while S may be a small system. The state of the closed 
system C is represented by a microcanonical ensemble with total energy 
U, while that of S, the system under consideration, will be shown to be 
described by a canonical distribution. The energy of S is a fluctuating 


quantity since there occurs exchange of energy between S and R. 


We begin by considering a stationary state, say, $|\psi_\epsilon\rangle$, of S, with energy $\epsilon$
(there may be other independent stationary states with the same energy
in virtue of degeneracy) and set out to determine its probability of
occurrence in the ensemble representing the state of equilibrium of S within
C.


Corresponding to the specified microstate $|\psi_\epsilon\rangle$ of S, R can be in any one
of its stationary states with energy $U - \epsilon$, with equal weight applying to all
these. Let the number of microstates of R with energy, say $E$, be $W_R(E)$,
and the corresponding number for C be $W_C(E)$.


Assuming that the stationary states of C can be expressed as product
states with factors belonging to S and R, it follows that, corresponding
to the pure state $|\psi_\epsilon\rangle$ of S, C can also be in any one of its own stationary
states numbering $W_R(U - \epsilon)$, with equal weights to all these. All these
states of C form a subset of the total number of its stationary states at
energy $U$, which is $W_C(U)$, all having equal weight.


We assume here that the interaction between S and R is a sufficiently weak
one, so that a stationary state of C can be expressed as a product of
stationary states of the two, the energy of the product state being the sum of
energies of the factor states. Though this is a special assumption
representing an idealization, it has no bearing on the final result regarding the
probability distribution of stationary states of S at any specified temperature
$T$. All that matters regarding the final result is that there has to be
exchange of energy between S and R. That the interaction between the
two is of vanishingly small strength is an artifact aimed at simplifying the
derivation of the required result.


According to the Boltzmann formula (2-2), one has,


$$W_R(U - \epsilon) = \exp\left[k_B^{-1}S_R(U - \epsilon)\right], \tag{3-113a}$$

and

$$W_C(U) = \exp\left[k_B^{-1}S_C(U)\right]. \tag{3-113b}$$


Since the stationary state $|\psi_\epsilon\rangle$ (of energy $\epsilon$) of S occurs in $W_R(U - \epsilon)$
microstates of C among a total of $W_C(U)$ microstates, all of
which have equal weight, the probability of occurrence of this microstate
of S in the ensemble representing its macrostate under the given conditions
is obtained as (refer to [15], where the notation differs)


$$\text{[probability of state with energy } \epsilon\text{]}\quad p_\epsilon = \frac{W_R(U - \epsilon)}{W_C(U)} = \frac{\exp\left[k_B^{-1}S_R(U - \epsilon)\right]}{\exp\left[k_B^{-1}S_C(U)\right]}. \tag{3-114a}$$


It is important to note that the above expression gives the probability of
a state of energy $\epsilon$ in the ensemble representing its macrostate, and not
the probability for its energy to be $\epsilon$. If the degeneracy of the energy level
$\epsilon$ be $g$, then the latter probability is seen to be


$$\text{[probability of energy } \epsilon\text{]}\quad p(\epsilon) = g\,p_\epsilon. \tag{3-114b}$$


The ensemble representing the macrostate pertaining to the formula (3-114a)
corresponds to the constraint that the system S is in contact with a reservoir
at temperature $T$, while exchanging energy with the latter. Assuming
that the volume ($V$) and number of particles ($N$) in it are given (recall that
the prototype system considered in most contexts in this book is a simple
fluid), one recognizes that (3-114a) actually corresponds to a canonical
ensemble.


It now remains to cast (3-114a) in more familiar form. We first observe
that $S_C(U)$ can be treated as a constant in the context of the probability
distribution pertaining to stationary states of S, and write

$$p_\epsilon = C'\exp\left[k_B^{-1}S_R(U - \epsilon)\right], \tag{3-115}$$


where $C'$ is a constant. We now take the logarithm of both sides and
assume that, the reservoir R being a large system as compared with S,
the mean energy ($\langle\epsilon\rangle$) of S is small compared to $U$ and that, moreover, the
range of fluctuation of $\epsilon$ is also small (compared, once again, to $U$). One
can then expand $S_R(U - \epsilon)$ around $S_R(U)$, up to the term linear in $\epsilon$,
obtaining


$$\ln p_\epsilon = \ln C' + k_B^{-1}S_R(U) - \frac{\epsilon}{k_BT}, \tag{3-116a}$$


where we have made use of the fact that, R being in a macrostate described
by the microcanonical ensemble at internal energy $U - \epsilon \approx U$ (this
approximation is permitted since, in the third term on the right hand side
of the above equation we already have the small factor $\epsilon$),


$$\left.\frac{\partial S_R(E)}{\partial E}\right|_{E=U} = \frac{1}{T}. \tag{3-116b}$$


Since the second term on the right hand side of (3-116a) is once again a
constant in respect of the probability distribution of the stationary states
of specified energies of S, we can write

$$p_\epsilon = \frac{1}{Z}e^{-\beta\epsilon}, \tag{3-117a}$$


where $\frac{1}{Z}$ is a normalization constant and $\beta = \frac{1}{k_BT}$.


The normalization condition $\sum_\epsilon p_\epsilon = 1$ gives

$$Z = \sum_\epsilon e^{-\beta\epsilon}. \tag{3-117b}$$


We thereby arrive at the canonical distribution formula (2-15a), allowing 
for a difference in notation. One can extend the derivation to other en- 
sembles such as the grand canonical ensemble or the pressure ensemble 


(see [66], chapter 2, section 13). 
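The emergence of the canonical distribution from microcanonical counting can be watched in a simple model. The sketch below is an illustration under stated assumptions, not the derivation of the text: the reservoir is taken to be a set of `N_R` identical harmonic oscillators, the system S is one further oscillator of the same frequency, and the composite holds `Q` quanta in all. The number of ways of distributing $q$ quanta among $M$ oscillators is the binomial coefficient $\binom{q+M-1}{q}$, whose logarithm plays the role of $S/k_B$. The probability that S holds $n$ quanta is then the ratio of the reservoir count at $Q-n$ quanta to the count for the composite, and it is compared with $e^{-\beta\epsilon}/Z$ with $\beta$ obtained from the slope of the reservoir entropy.

```python
import math

def ln_multiplicity(n_osc, quanta):
    """ln of the number of ways to place `quanta` quanta in `n_osc` oscillators,
    i.e. ln C(quanta + n_osc - 1, quanta); this is S / k_B for that macrostate."""
    return (math.lgamma(quanta + n_osc)
            - math.lgamma(quanta + 1)
            - math.lgamma(n_osc))

N_R = 10**6   # number of reservoir oscillators (assumed)
Q = 10**6     # total number of quanta in the composite system (assumed)

def p_microcanonical(n):
    """Probability that S holds n quanta: W_R(Q - n) / W_C(Q)."""
    return math.exp(ln_multiplicity(N_R, Q - n) - ln_multiplicity(N_R + 1, Q))

# beta * (energy of one quantum), from the finite-difference slope of ln W_R
BETA_EPS = ln_multiplicity(N_R, Q) - ln_multiplicity(N_R, Q - 1)

def p_canonical(n):
    """exp(-beta * n * quantum) / Z, with Z the geometric sum over n = 0, 1, 2, ..."""
    z = 1.0 / (1.0 - math.exp(-BETA_EPS))
    return math.exp(-n * BETA_EPS) / z

for n in range(0, 11, 2):
    print(f"n = {n:2d}: microcanonical {p_microcanonical(n):.8f}   "
          f"canonical {p_canonical(n):.8f}")
```

The two columns agree closely as long as $n$ is small compared with the number of quanta in the reservoir, which is the numerical counterpart of keeping only the term linear in $\epsilon$ in the expansion of the reservoir entropy.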


Though we have referred to stationary states with sharply defined energies
in the above derivation, it is often more appropriate to talk in terms
of states belonging to small energy intervals, bringing in the density of
states at various energies. The final result, however, remains unchanged
provided $p_\epsilon$ is used to denote the probability density of a state at energy
$\epsilon$, and $Z$ is defined in terms of an integral rather than a sum.


An analogous derivation holds in the classical context as well ([67], chap- 


ter 7). 


1. While we have derived the canonical distribution formula from the mi- 
crocanonical distribution in a bigger system, a more direct derivation is 
possible, based on the Maxwell-Boltzmann distribution formula (refer 


to sections 3.3.4, 3.3.5) for the classical ideal gas. Indeed, considering 
a given system S in equilibrium with a reservoir R at a temperature
T, one can imagine that R is made up of a large number of copies 
of S itself, in which case the entire system made up of S and R can 
be looked upon as an ‘ideal gas’ made up of so many copies of S (the 
‘molecules’ making up the gas) with an interaction of vanishingly small 
strength among themselves. The copies of S can be considered one at 
a time and are thus identical but distinguishable, which implies that 


the classical energy distribution formula (3-58b) applies. 


This gives the probability distribution formula for the system S in the 
canonical distribution (see [67], chapter 9, [85], appendix to chapter 3). 
In other words, though the canonical probability distribution formula
and the Maxwell-Boltzmann energy distribution formula for the classical
ideal gas pertain to entirely different contexts, one may, in a sense, be
looked upon as a consequence of the other.


This direct derivation of the canonical distribution formula works for 


both classical and quantum systems. 


2. Referring to sec. 3.3.5, we recall that the derivation of the Maxwell- 
Boltzmann distribution formula for the classical ideal gas within the 
framework of the microcanonical ensemble requires the application 
of the Stirling approximation in order to work out the most probable 
distribution among the single-particle states of the gas. One can, from 
the mathematical point of view, adopt a more rigorous approach and 
invoke the Darwin-Fowler method ([67], chapter 9) so as to work out 
the mean occupation numbers of the single-particle states. In addition 
to obtaining the mean occupation numbers in the thermodynamic limit 
(with reference to the fictitious system made up of copies of the system
of interest (S)), the Darwin-Fowler method yields the result that the
fluctuations in the occupation numbers also go to zero in this limit.

Once again, the method works regardless of whether the system S
is a classical or a quantum mechanical one.


3. The derivations of the canonical from the microcanonical ensemble 
presented in this chapter notwithstanding, there remains the ques- 
tion as to how the microcanonical ensemble arises in the first place 
for an isolated system. We address this question in the setting provided
by the eigenstate thermalization hypothesis (ETH) in chapter 10,
where, additionally, we inquire as to how the canonical distribution
can emerge for a subsystem under conditions more general than the
ones pertaining to the microcanonical ensemble.
Chapter 4


Statistical Mechanics of 


Interacting Systems I 


4.1 Equation of state for a dilute gas: the classical virial expansion


4.1.1 Virial expansion: introduction 


The classical ideal gas is described by any of the following three fundamental
formulas, all valid in the thermodynamic limit, for the thermodynamic
potentials (refer back to section 3.2, where these formulas were derived
in the context of the three basic classical ensembles; recall that
$\lambda_T$, the thermal de Broglie wavelength at temperature $T$, is given by
$\lambda_T = \frac{h}{(2\pi m k_B T)^{1/2}}$):

$$S = N k_B\left[\ln\left(\frac{V}{N\lambda_T^3}\right) + \frac{5}{2}\right], \tag{4-1a}$$

$$F = -N k_B T\left[\ln\left(\frac{V}{N\lambda_T^3}\right) + 1\right], \tag{4-1b}$$

$$\Omega = -k_B T\,\frac{\gamma V}{\lambda_T^3}, \tag{4-1c}$$

in a notation by now familiar.


Recall, in this context, that $\gamma$ stands for the fugacity
($\gamma = e^{\beta\mu}$), also referred to as the ‘absolute activity’, while
$\frac{\gamma}{\lambda_T^3}$, which is a parameter of relevance in the virial
expansion, is termed the ‘activity’.


Starting from any one of these, one can derive further thermodynamic
functions by differentiation, and then obtain equations of state for the
gas. For instance, one obtains, from (4-1c),

$$p = -\frac{\partial\Omega}{\partial V} = k_B T\,\frac{\gamma}{\lambda_T^3}, \qquad
\bar{N} = -\frac{\partial\Omega}{\partial\mu} = \frac{\gamma V}{\lambda_T^3}, \tag{4-1d}$$

from which we obtain the following equation of state for the classical ideal
gas:

$$\frac{p}{k_B T} = \rho, \tag{4-1e}$$

where $\rho$ stands for the mean particle number density $\frac{\bar{N}}{V}$
(this was previously denoted by the symbol $n$).
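As a small numerical cross-check of the route from (4-1c) to (4-1e), the sketch below differentiates the ideal-gas grand potential by central finite differences. It is only an illustration: the units ($h = m = k_B = 1$), the values of $V$, $\mu$, $T$ and the helper names are assumptions made for this example.

```python
import math

H = M = KB = 1.0   # assumed units with h = m = k_B = 1

def thermal_wavelength(T):
    """lambda_T = h / sqrt(2 pi m k_B T)."""
    return H / math.sqrt(2.0 * math.pi * M * KB * T)

def grand_potential(V, mu, T):
    """Omega = -k_B T * gamma * V / lambda_T^3 with gamma = exp(mu / (k_B T)), eq. (4-1c)."""
    gamma = math.exp(mu / (KB * T))
    return -KB * T * gamma * V / thermal_wavelength(T) ** 3

def central_diff(func, x, h):
    """Symmetric finite-difference estimate of d(func)/dx."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

V, mu, T = 10.0, -1.5, 2.0   # hypothetical state point
p = -central_diff(lambda v: grand_potential(v, mu, T), V, 1e-4)
N_mean = -central_diff(lambda m: grand_potential(V, m, T), mu, 1e-4)
rho = N_mean / V

print("p / (k_B T) =", p / (KB * T))
print("rho         =", rho)
```

The two printed numbers agree to within the finite-difference error, as (4-1e) requires.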


A real gas is found to deviate from this equation of state, the deviation
being progressively more pronounced at relatively high pressures and low
temperatures, i.e., for relatively high values of $\rho$. As already mentioned,
the deviation occurs on two counts: first, due to quantum effects, and
secondly, due to interactions among the particles (commonly referred to
as ‘molecules’) making up the gas. For relatively low values of $\rho$,
corresponding to dilute gases, both the effects are small and, as mentioned in
sec. 2.2.4, one can express the equation of state for a real gas in the form
of a series

$$\frac{p}{k_B T} = \rho\left(1 + B_2(T)\rho + B_3(T)\rho^2 + \cdots\right), \tag{4-2}$$


where $B_2(T), B_3(T), \ldots$ are referred to as the second, third, $\ldots$
virial coefficients respectively. These are temperature dependent functions,
specific for a gas, and determine its thermodynamic behavior (the first virial
coefficient is, by definition, unity for a dilute gas). We will, for the time being,
ignore quantum effects and look at the classical theory for the determina- 
tion of these coefficients, relating these to the intermolecular interactions 
in the gas, describing the thermodynamic behavior of the latter in the 
grand canonical ensemble. Quantum effects for an ideal gas have already 
been considered in sec. 3.3. The quantum mechanical theory underlying 
the determination of the virial coefficients will be briefly considered in 
sections 4.2 and 4.3. Sec. 4.4 will be devoted to the summing up of all 


these different considerations relating to dilute gases. 
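To see how a truncated form of the series (4-2) is used in practice, the following Python sketch evaluates $\frac{p}{\rho k_B T}$ for a given list of virial coefficients. The numerical values of $B_2$ and $B_3$ below are hypothetical and in arbitrary units; they are not taken from the text and serve only to show how the corrections grow with density.

```python
def compressibility_factor(rho, coeffs):
    """p / (rho k_B T) = 1 + B_2 rho + B_3 rho^2 + ..., following eq. (4-2).
    coeffs = [B_2, B_3, ...], evaluated at the temperature of interest."""
    total, power = 1.0, 1.0
    for B in coeffs:
        power *= rho
        total += B * power
    return total

# Hypothetical coefficients (arbitrary units), for illustration only.
B2, B3 = -0.5, 0.2
for rho in (0.001, 0.01, 0.1):
    second_order = compressibility_factor(rho, [B2])
    third_order = compressibility_factor(rho, [B2, B3])
    print(f"rho = {rho:6.3f}  up to B_2: {second_order:.6f}  up to B_3: {third_order:.6f}")
```

For small $\rho$ the successive truncations differ very little, which is the sense in which the expansion is useful for dilute gases.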


4.1.2 Virial expansion: cumulants and clusters 


From a fundamental point of view, the virial coefficients are all determined
by the intermolecular potential energy function (refer to eq. (1-7))
which, for a system of $N$ particles confined within a volume $V$, depends
on the phase space co-ordinates $q_1, q_2, \ldots, q_s$ ($s = 3N$) of the
particles or, in other words, on their position vectors
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$ with reference to some
chosen origin:

$$\text{potential energy} = \Phi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \tag{4-3}$$


Here we have assumed that the particles are not influenced by any external
field and that their interactions with the boundary surface of the
confining volume have no effect on the behavior of the gas in the
thermodynamic limit.


In the following we will assume that the intermolecular interaction can be
completely described in terms of a two-body central force, in which case
one can write

$$\Phi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \sum_{i<j}\phi(|\mathbf{r}_i - \mathbf{r}_j|) = \sum_{i<j}\phi(r_{ij}), \tag{4-4}$$

where $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$, and $\phi(r)$ stands for the
two-body central potential which, in the present context, is the ultimate
object of interest. Formula (4-4) implies that the total potential energy
$\Phi$ depends additively on the two-body potential energies.


Three-body forces appear to be of relevance in numerous situations of physical
interest [111]. While we confine our considerations to two-body interactions,
admissible potentials $\phi(r)$ are to satisfy certain conditions in order
that the virial expansion may converge. For instance, $\phi(r)$ should go to zero
faster than $r^{-3}$ for $r \to \infty$ (this rules out the Coulomb interaction,
for which one needs to engage in special considerations). Further, we assume
that the interaction is isotropic (i.e., $\phi$ does not depend on the
orientation of $\mathbf{r}$, the vector separation between two molecules).
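The decay condition can be made plausible by a rough estimate (a sketch, not a proof). Assume, purely for illustration, a power-law tail $\phi(r) \approx -A\,r^{-n}$ at large $r$, with $A$ a constant. Since $e^{-\beta\phi} - 1 \approx -\beta\phi$ when $\beta|\phi| \ll 1$, the large-distance part of a volume integral of $e^{-\beta\phi(r)} - 1$ beyond some radius $R$ behaves as

```latex
\int_{R}^{\infty} r^{2}\left(e^{-\beta\phi(r)} - 1\right)dr
  \;\approx\; \beta A \int_{R}^{\infty} r^{2-n}\,dr
  \;=\; \beta A\,\frac{R^{\,3-n}}{n-3} \qquad (n > 3),
```

while for $n \le 3$ the integral diverges (logarithmically for $n = 3$). The Coulomb case, $n = 1$, falls in the divergent class.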


The form (4-4) is found to be inadequate in explaining quantitatively the 
third and higher virial coefficients of real gases, since one finds that a com- 
putation based solely on the star diagrams does not reproduce these coeffi- 


cients to a sufficient degree of accuracy. 


The grand canonical partition function is given by (refer to formula (2-96))

$$Z_G = \sum_{N=0}^{\infty}\gamma^N Z_C(V, T, N), \tag{4-5}$$

where $Z_C(V, T, N)$ stands for the canonical partition function for $N$
particles confined in a volume $V$ at temperature $T$, with the potential energy
$\Phi$ as in (4-4) (the reference to volume and temperature will be left implied
from now on). Referring to (2-66b) and performing the $3N$-fold momentum
integration, one obtains

$$Z_C(N) = \frac{1}{N!\,\lambda_T^{3N}}\int d^{(3N)}r\,\exp\left(-\beta\Phi^{(N)}\right)
= \frac{1}{N!\,\lambda_T^{3N}}\int d^{(3N)}r\,\exp\Big(-\beta\sum_{i<j}\phi(r_{ij})\Big), \tag{4-6}$$


where $d^{(3N)}r$ stands for the volume element
$d^{(3)}r_1\,d^{(3)}r_2\cdots d^{(3)}r_N$ in the $3N$-dimensional space of the
position co-ordinates of the $N$ particles (this has been denoted by the symbol
dQ in earlier sections), and $\Phi^{(N)}$ stands for the potential energy of the
system of $N$ particles, given by (4-4). The integral

$$C_N = \int d^{(3N)}r\,\exp\left(-\beta\Phi^{(N)}\right) = \int d^{(3N)}r\,\exp\Big(-\beta\sum_{i<j}\phi(r_{ij})\Big) \tag{4-7}$$

will be referred to as the configuration integral for $N$ particles under
conditions mentioned above. One thus has, for the system under consideration,

$$Z_C(N) = \frac{C_N}{N!\,\lambda_T^{3N}}, \qquad
Z_G = \sum_{N=0}^{\infty}\frac{1}{N!}\left(\frac{\gamma}{\lambda_T^3}\right)^N C_N. \tag{4-8}$$


The special case of the ideal gas corresponds to $\Phi = 0$, and one immediately
obtains $C_N = V^N$, from which follows the expression (4-1c) for the
grand potential ($\Omega = -\beta^{-1}\ln Z_G$).
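The step from $C_N = V^N$ to (4-1c) can be spelled out as follows; this merely fills in the summation implied in the text, using the grand partition function written as a sum over $N$ of $\frac{1}{N!}\left(\frac{\gamma}{\lambda_T^3}\right)^N C_N$:

```latex
Z_G = \sum_{N=0}^{\infty}\frac{1}{N!}\left(\frac{\gamma V}{\lambda_T^{3}}\right)^{N}
    = \exp\!\left(\frac{\gamma V}{\lambda_T^{3}}\right),
\qquad
\Omega = -\beta^{-1}\ln Z_G = -k_B T\,\frac{\gamma V}{\lambda_T^{3}} .
```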


In the presence of interactions the object of interest is the configuration
integral $C_N$, a knowledge of which is sufficient for setting up the equation
of state of the gas in the classical description. For this, one notes that

$$C_N = \int d^{(3N)}r\,\exp\Big(-\beta\sum_{i<j}\phi(r_{ij})\Big) = \int d^{(3N)}r\,\prod_{i<j}\left(1 + f_{ij}\right), \tag{4-9a}$$

where, in the second equality, the expressions $f_{ij}$ stand for

$$f_{ij} = e^{-\beta\phi(r_{ij})} - 1. \tag{4-9b}$$


For most cases of interest, the two-body interaction potential $\phi(r)$ is of the
general form shown in fig. 4-1(A), where it is seen to have a repulsive hard
core at short distances and an attractive ‘tail’ at relatively large distances,
the latter decaying as the mutual separation between two molecules increases
beyond a certain range. Fig. 4-1(B) shows the general nature of
$f(r) = e^{-\beta\phi(r)} - 1$, in which it is seen that $f(r) \to -1$ as
$r \to 0$, and $f(r) \to 0$ as $r \to \infty$. In other words, $f(r)$ remains
bounded in the entire range $0 < r < \infty$ (in contrast to $\phi(r)$);
additionally, it attains its limiting values (for $r \to 0$ and $r \to \infty$)
much faster than $\phi$, which makes it particularly suitable for developing
$\frac{p}{k_B T}$ in a perturbative series for sufficiently small values of
$\rho = \frac{N}{V}$. The function $f(r)$ will, at times, be referred to as the
‘f-function’.
Figure 4-1: Depicting the general nature of variation (with intermolecular
separation $r$) of (A) the two-body potential $\phi(r)$ and of (B)
$f(r) = e^{-\beta\phi(r)} - 1$ for a large class of systems of interest;
$\phi(r)$ has a repulsive core ($\phi \to \infty$ for $r < a$ in (A)) and
an attractive tail ($\phi$ negative, going to 0 for $r > r_0$, the range of the
interaction); $f(r)$, on the other hand, remains finite for all values of $r$,
and is suitable for the perturbative evaluation of $\Omega$ and the development
of $\frac{p}{k_B T}$ in a power series for sufficiently small values of
$\rho = \frac{N}{V}$.
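The contrast between $\phi(r)$ and $f(r)$ can be seen in a short numerical sketch. The Lennard-Jones 12-6 form used below is only a stand-in for a potential with a steep repulsive core and an attractive tail; it is an assumption made for illustration (with $\epsilon = \sigma = 1$ and $\beta = 1$ in reduced units), not a model singled out in the text.

```python
import math

def lennard_jones(r, eps=1.0, sigma=1.0):
    """Lennard-Jones 12-6 potential, used here only as a stand-in for phi(r)."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6)

def mayer_f(r, beta, phi=lennard_jones):
    """f(r) = exp(-beta * phi(r)) - 1, as in eq. (4-9b)."""
    return math.expm1(-beta * phi(r))

beta = 1.0
for r in (0.5, 0.9, 1.0, 2 ** (1 / 6), 2.0, 5.0):
    print(f"r = {r:5.3f}   phi = {lennard_jones(r):12.4e}   f = {mayer_f(r, beta):9.5f}")
```

In the output, $\phi$ grows without bound at small $r$ while $f$ saturates at $-1$, and $f$ is already close to zero a few diameters away.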
4.1.2.1 The N-particle configuration integral: representation by diagrams


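Looking at the formula (4-9a), one can express the $N$-particle configuration
integral in the form

$$\begin{aligned}
C_N &= \int d^{(3N)}r\, W_N(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \quad \text{(say)} \\
&= \int d^{(3N)}r\left[1 + (f_{12} + f_{13} + f_{23} + \cdots) + (f_{12}f_{13} + f_{12}f_{34} + \cdots) + \cdots\right].
\end{aligned} \tag{4-10}$$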


This expression is made up of groups of terms, of which the first is simply
unity which, when substituted in the two formulas in (4-8), gives the
canonical and grand canonical partition functions for the classical ideal
gas. All the remaining groups result in corrections of various orders over
the ideal gas formulas. The first of these is a sum of integrals over single
f-functions, the second one is a similar sum of integrals, but now over
products of two f-functions, and so on. Since all the particles making up
the system under consideration are identical, the integrals belonging to
the first of these groups are all equal to

$$\begin{aligned}
J_1 = \int d^{(3N)}r\, f_{12} &= V^{N-2}\int d^{(3)}r_1\, d^{(3)}r_2\, f_{12} \\
&= V^{N-1}\int d^{(3)}r\, f(r) = 4\pi V^{N-1}\int_0^{\infty} r^2 f(r)\,dr
\end{aligned} \tag{4-11a}$$

(reason this out; there are $\frac{N(N-1)}{2}$ such terms in (4-10)). I write down
two other integrals belonging to the next group of terms for the sake of
concreteness:

$$J_2 = \int d^{(3N)}r\, f_{12}f_{34} = V^{N-4}\int d^{(3)}r_1\, d^{(3)}r_2\, d^{(3)}r_3\, d^{(3)}r_4\, f_{12}f_{34} \tag{4-11b}$$

and

$$J_3 = \int d^{(3N)}r\, f_{12}f_{13} = V^{N-3}\int d^{(3)}r_1\, d^{(3)}r_2\, d^{(3)}r_3\, f_{12}f_{13}. \tag{4-11c}$$

Other terms in the expansion involve similar integrals over products of
various different numbers of f-functions.
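The expansion of the product in (4-9a) into the sum in (4-10) is a purely algebraic identity: every term corresponds to a choice of a subset of pairs (a graph on the $N$ particles). The following Python sketch checks this for $N = 4$ with randomly chosen numbers standing in for the values of the $f_{ij}$ at one configuration; the random values and helper names are assumptions made for illustration.

```python
import itertools
import math
import random

def pair_list(n):
    """All pairs (i, j) with i < j among n particles."""
    return list(itertools.combinations(range(n), 2))

def product_form(f):
    """prod_{i<j} (1 + f_ij), the integrand appearing in (4-9a)."""
    return math.prod(1.0 + v for v in f.values())

def graph_sum(f):
    """Sum, over all subsets of pairs (graphs), of the product of the f_ij
    in the subset, which is the integrand of (4-10) written out term by term."""
    pairs = list(f)
    total = 0.0
    for k in range(len(pairs) + 1):
        for subset in itertools.combinations(pairs, k):
            total += math.prod(f[p] for p in subset)   # empty product is 1
    return total

random.seed(0)
N = 4
f = {p: random.uniform(-1.0, 0.5) for p in pair_list(N)}   # stand-in values of f_ij
lhs, rhs = product_form(f), graph_sum(f)
n_graphs = 2 ** len(f)
print(lhs, rhs, n_graphs)
```

For $N = 4$ there are six pairs and hence $2^6 = 64$ terms, which already hints at why a diagrammatic bookkeeping is needed for larger $N$.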


Each such integral can be represented by means of a diagram. For instance,
$J_1$ of (4-11a) corresponds to the diagram (A) of fig. 4-2, while $J_2$
and $J_3$ correspond to diagrams (B) and (C) in the same figure. In diagram
(A), we have two un-numbered circles connected by a line, corresponding
to the fact that it represents an integral of a single f-function
involving the potential energy of two particles. Since there are
$\frac{N(N-1)}{2}$ possible choices for a pair $ij$, the diagram in (A) occurs
with a multiplying factor of $\frac{N(N-1)}{2}$ in $C_N$. An alternative is to
assign two integer numbers $i, j$ ($1 \le i, j \le N$, $i \ne j$) to the two
circles in the diagram, without regard to which number is assigned to which
circle. All such numbered diagrams contribute equally to $C_N$ (reason out
why), which is equivalent to considering only the un-numbered diagram of
fig. 4-2 and multiplying its contribution by the factor $\frac{N(N-1)}{2}$.


In a similar vein, the integral $J_2$ of (4-11b), involving a product of two
f-functions carrying different integer indices (pairs 1,2 and 3,4),
corresponds to the diagram (B) in fig. 4-2, where there are four un-numbered
circles making up two pairs, with the members of each pair joined by a
line (one thus has two disjoint pairs, corresponding to the fact that the
f-functions carry distinct pairs of indices; each of the two pairs is a
connected graph as in diagram (A)). The contribution of this diagram is to
be multiplied with $\frac{N(N-1)(N-2)(N-3)}{8}$, corresponding to the various
different ways of numbering the circles (reason out why; note that an
interchange of the numbers assigned to the circles in either of the two
disjoint parts in the figure does not result in a new term in (4-10)).


In the case of the diagram (C) of fig. 4-2, one index is common in the
two f-functions, which makes the diagram a connected one, referred to as
a cluster (thus, diagram (A) is a cluster, while diagram (B) is not, being
made up of two clusters).
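The counting factors attached to the two-line diagrams can be verified by brute-force enumeration. The sketch below (an illustration with hypothetical helper names) lists all products of two distinct f-functions among $n$ particles and sorts them into disjoint pairs, as in diagram (B), and pairs sharing one index, as in diagram (C).

```python
import itertools

def count_two_bond_terms(n):
    """Classify the products f_ij f_kl of two distinct bonds among n particles."""
    bonds = list(itertools.combinations(range(n), 2))
    disjoint = connected = 0
    for b1, b2 in itertools.combinations(bonds, 2):
        if set(b1) & set(b2):
            connected += 1   # one shared index: a three-particle cluster
        else:
            disjoint += 1    # no shared index: two separate two-particle clusters
    return len(bonds), disjoint, connected

for n in (4, 5, 6):
    single, disjoint, connected = count_two_bond_terms(n)
    print(n, single, disjoint, connected)
```

The three counts agree with $\frac{N(N-1)}{2}$, $\frac{N(N-1)(N-2)(N-3)}{8}$ and $\frac{N(N-1)(N-2)}{2}$ respectively.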


Generalizing, the configuration integral $C_N$ is expressed as a sum of
contributions from all possible distinct $k$-particle diagrams of the type shown
in fig. 4-2 with $k$ ranging from 2 to $N$ ($k = 1$ corresponds to the ideal-gas
contribution $\int d^{(3N)}r = V^N$ to $C_N$), where the contribution of each
such diagram appears as a $k$-fold volume integral, along with a factor
$V^{N-k}$ and another combinatorial factor specifying the number of topologically
distinct ways that integer values (ranging from 1 to $N$) can be assigned
to the circles in the diagram (refer, for an example, to the box shown in
fig. 4-2(C), where the two assignments of numbers are not to be counted
as distinct ones). In the following, we will consider diagrams with integer
indices assigned to the circles in all possible distinct ways, thereby
obviating the need to refer to the combinatorial factors mentioned above, and
will formally express the contribution of each diagram in $C_N$ as an $N$-fold
volume integral, thereby accounting for the factor of $V^{N-k}$.

Figure 4-2: Simple examples of diagrams representing the individual terms
occurring in the expression (4-10) for the $N$-particle configuration integral; (A)
the diagram for the integral of a single f-function involving the interaction energy
of two particles, made up of two un-numbered circles joined by a line; there are
$\frac{N(N-1)}{2}$ such terms in $C_N$, any one of these corresponding to a
specific pair of indices, say 1,2, which would then have to be assigned to the
two circles (the assignment 2,1 represents the same term), and the diagram would
then represent a specific term (the integral $J_1$ in (4-11a)) in $C_N$ rather
than a group of terms; (B) an un-connected four-particle diagram representing a
group of $\frac{N(N-1)(N-2)(N-3)}{8}$ terms, each represented by two disjoint
two-particle clusters; each term belonging to this group is equal to the
integral $J_2$ in (4-11b); (C) a connected diagram, termed a cluster, involving
three particles, in which one of the (un-numbered) circles is connected to each
of two others, the latter having no connecting line between them; there are
$\frac{N(N-1)(N-2)}{2}$ such diagrams, in each of which distinct integer indices
are assigned to the circles, the contribution of any of these diagrams with
numbered circles being the same as $J_3$ in (4-11c); the two diagrams with
numbered circles shown in the box are, however, not to be counted as distinct
because both of these correspond to the product $f_{12}f_{13}$ as the integrand.


Knowing $C_N$, one obtains, in principle, the $N$-particle canonical partition
function $Z_C(N)$, and then the grand partition function $Z_G$ as

$$Z_G = \sum_{N=0}^{\infty}\frac{1}{N!}\left(\frac{\gamma}{\lambda_T^3}\right)^N\int d^{(3N)}r\, W_N(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \tag{4-12}$$

where, as outlined above, the $N$-particle configuration integral

$$C_N = \int d^{(3N)}r\, W_N(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$$

can be expressed as a sum of all possible $N$-particle diagrams. The diagrams
and their contributions to $C_N$ are described as follows.


Each diagram is made up of circles (say, $k$ in number, there being diagrams
for all possible values of $k$ ranging from 2 to $N$) numbered with
integer indices $1, 2, \ldots, N$, where each circle is connected by a line to some
other circle, no pair of circles being connected with more than one line
(some pairs may not have connecting lines between them). Additionally,
the diagram contains $N - k$ lone circles, none of which is connected to any
other circle in the diagram. For any given arrangement of the circles and
connecting lines, all possible assignments of numbers ($1, 2, \ldots, N$) to the
circles corresponding to distinct configurations (as described by lists of
which numbered circles are connected to which others) give rise to a number of
diagrams, all of which contribute equally to $C_N$, which is why one can erase
the assigned numbers and consider just one single diagram for the given
arrangement of circles and connecting lines, with an appropriate combinatorial
factor (depending only on the topological configuration of the circles and
connecting lines) multiplying the contribution of that single diagram. The
contribution is (apart from the multiplicative factor) an $N$-fold volume
integral (or, equivalently, a $k$-fold volume integral multiplied with
$V^{N-k}$) with an integrand that is a product of factors, there being a factor
for each connecting line with numbered circles at its ends (an f-function with
these two numbers as indices; for this it is sufficient to consider any single
assignment of numbers to the circles).


The grand partition function $Z_G$ is the sum of all possible diagrams
contributing to the $C_N$'s for all possible non-negative integer values of $N$ as
in (4-12). In order to obtain the grand potential
($\Omega = -\beta^{-1}\ln Z_G$), one has to work out the logarithm of $Z_G$,
which may be expressed in the form of an expansion (referred to as the
cumulant expansion) analogous to (4-12):

$$\ln Z_G = \sum_{l=1}^{\infty}\frac{1}{l!}\left(\frac{\gamma}{\lambda_T^3}\right)^l\int d^{(3l)}r\, U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l), \tag{4-13}$$

where now the term $l = 0$ is absent in the sum (recall that the power
series expansion of $\ln(1 + x)$ starts with the first degree term in $x$, without
a term of degree zero).


The cumulant expansion. The basic idea underlying the cumulant expansion
is to represent the logarithm of a power series (denoted by, say, $P_1$) in
the form of a second power series (say, $P_2$), and then to compare $e^{P_2}$ and
$P_1$ term by term in the expansion parameter which, in the present context,
is the activity $\frac{\gamma}{\lambda_T^3}$. The comparison of the two series is
carried out in the thermodynamic limit $N \to \infty$, $V \to \infty$,
$\frac{N}{V} \to \rho$.

In statistics, $P_1$ is taken as the power series expansion of the moment
generating function of a probability distribution, where the coefficients occurring
in the expansion are the various moments of the distribution; $P_2$, on the
other hand, represents the power series expansion of the cumulant generating
function (the logarithm of the moment generating function) in which
the coefficients are the cumulants of the various orders. A comparison of
the two series then yields the cumulants in terms of the moments.

In the present context the cumulants are the coefficients occurring in the
expansion of $\ln Z_G$ and are obtained as multiple integrals over the
configuration space, while the expansion of $Z_G$ yields a series whose coefficients
are, once again, multiple integrals. A comparison then yields a series of relations
between the integrands, these being the ones expressed in (4-15b).
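The statistical version of this term-by-term comparison can be sketched in a few lines of Python. The recursion used below, $\kappa_n = m_n - \sum_{k=1}^{n-1}\binom{n-1}{k-1}\kappa_k m_{n-k}$, is the standard relation between raw moments $m_n$ and cumulants $\kappa_n$ that follows from matching the two series; the function name and the two test distributions are choices made for this illustration.

```python
from fractions import Fraction
from math import comb

def cumulants_from_moments(m):
    """m[n] is the n-th raw moment (m[0] = 1). Returns [kappa_1, kappa_2, ...],
    obtained by matching ln(sum_n m_n t^n / n!) = sum_n kappa_n t^n / n!
    order by order in t."""
    kappa = [Fraction(0)] * len(m)
    for n in range(1, len(m)):
        kappa[n] = Fraction(m[n]) - sum(
            comb(n - 1, k - 1) * kappa[k] * m[n - k] for k in range(1, n)
        )
    return kappa[1:]

# Standard normal distribution: raw moments 1, 0, 1, 0, 3, 0, 15.
gaussian = cumulants_from_moments([1, 0, 1, 0, 3, 0, 15])
# Poisson distribution with unit mean: raw moments 1, 1, 2, 5, 15, 52.
poisson = cumulants_from_moments([1, 1, 2, 5, 15, 52])
print(gaussian)
print(poisson)
```

For the Gaussian only the second cumulant survives, and for the Poisson distribution all cumulants are equal. The Ursell functions play the role of the cumulants when the 'moments' are the functions $W_N$.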


A statement of crucial relevance is that, analogous to the $C_N$'s, the
integrals $\int d^{(3l)}r\, U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l)$
can be expressed as sums of terms where the terms correspond to diagrams as
already described, with the restriction that now one has to consider only the
connected diagrams, or the clusters (i.e., the diagrams that cannot be
decomposed into more than one disjoint ones), introduced above. This I explain
below in brief outline.


4.1.2.2 The grand partition function in terms of cluster integrals

We start from the $N$-particle canonical partition function (with a slight
change in notation)

$$Z_C^{(N)} = \frac{1}{N!\,\lambda_T^{3N}}\int d^{(3N)}r\, W_N(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \tag{4-14}$$

(refer to the first equality in (4-8)) where (i) the function $W_N$ is symmetric in
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$, (ii) $W_N \to 1$ as the
separation between each pair of particles becomes infinitely large, and (iii)
$W_N$ possesses the following separation property: suppose the $N$ particles are
divided into groups of $n_1, n_2, \ldots$ particles such that each group is
separated from all the other groups by an infinitely large distance; $W_N$ can
then be decomposed into a product $W_N = W_{n_1}W_{n_2}\cdots$. These properties
ensure that the $W_N$'s can be expressed successively in terms of functions
$U_1, U_2, \ldots$ (referred to as Ursell functions) as

$$\begin{aligned}
W_1(\mathbf{r}) &= U_1(\mathbf{r}) = 1, \\
W_2(\mathbf{r}_1, \mathbf{r}_2) &= U_2(\mathbf{r}_1, \mathbf{r}_2) + U_1(\mathbf{r}_1)U_1(\mathbf{r}_2), \\
W_3(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) &= U_3(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) + U_1(\mathbf{r}_1)U_2(\mathbf{r}_2, \mathbf{r}_3) + U_1(\mathbf{r}_2)U_2(\mathbf{r}_1, \mathbf{r}_3) + U_1(\mathbf{r}_3)U_2(\mathbf{r}_1, \mathbf{r}_2) \\
&\quad + U_1(\mathbf{r}_1)U_1(\mathbf{r}_2)U_1(\mathbf{r}_3), \\
&\;\;\vdots
\end{aligned} \tag{4-15a}$$


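where these functions have the following important property:
$U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l)$ ($l = 2, 3, \ldots$) is
zero if the separation between any two of the $l$ position vectors
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l$ goes to infinity (check that
this property of the $U_l$'s is consistent with the separation property of the
$W_N$'s mentioned above). The inverse relations defining the $U_l$'s in terms of
the $W_N$'s are then obtained as

$$\begin{aligned}
U_1(\mathbf{r}) &= W_1(\mathbf{r}) = 1, \\
U_2(\mathbf{r}_1, \mathbf{r}_2) &= W_2(\mathbf{r}_1, \mathbf{r}_2) - W_1(\mathbf{r}_1)W_1(\mathbf{r}_2), \\
U_3(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) &= W_3(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) - W_1(\mathbf{r}_1)W_2(\mathbf{r}_2, \mathbf{r}_3) - W_1(\mathbf{r}_2)W_2(\mathbf{r}_1, \mathbf{r}_3) - W_1(\mathbf{r}_3)W_2(\mathbf{r}_1, \mathbf{r}_2) \\
&\quad + 2W_1(\mathbf{r}_1)W_1(\mathbf{r}_2)W_1(\mathbf{r}_3), \\
&\;\;\vdots
\end{aligned} \tag{4-15b}$$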
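For the case at hand, where $W_N = \prod_{i<j}(1 + f_{ij})$, the relations (4-15b) can be checked against the diagrammatic statement directly. The Python sketch below uses random numbers as stand-ins for the values of $f_{12}$, $f_{13}$, $f_{23}$ at one configuration (an assumption for illustration) and confirms that $U_2$ is the single two-particle cluster while $U_3$ is the sum of the four connected three-particle diagrams.

```python
import itertools
import math
import random

random.seed(1)
# Stand-in values of the f-functions for particles 1, 2, 3 at one configuration.
f = {p: random.uniform(-1.0, 0.5) for p in itertools.combinations((1, 2, 3), 2)}

def W(*idx):
    """W_n = prod_{i<j} (1 + f_ij) over the listed particles (so that W_1 = 1)."""
    return math.prod(1.0 + f[p] for p in itertools.combinations(sorted(idx), 2))

# Inverse relations (4-15b)
U2_12 = W(1, 2) - W(1) * W(2)
U3 = (W(1, 2, 3) - W(1) * W(2, 3) - W(2) * W(1, 3) - W(3) * W(1, 2)
      + 2 * W(1) * W(2) * W(3))

# Connected three-particle diagrams: three open chains and one triangle.
f12, f13, f23 = f[(1, 2)], f[(1, 3)], f[(2, 3)]
clusters3 = f12 * f13 + f12 * f23 + f13 * f23 + f12 * f13 * f23

print(U2_12, f12)
print(U3, clusters3)
```

All disconnected contributions cancel in $U_3$, leaving only terms in which the three particles are linked together.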


The fact that $U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l)$
($l = 1, 2, \ldots$) is non-zero only if all the points with position vectors
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l$ are at finite distances from
one another tells us that its integral corresponds to the sum of all connected
$l$-particle diagrams, i.e., of all $l$-particle clusters. The corresponding
integral in (4-13) we refer to as the $l$-particle cluster integral. What is
significant about a cluster integral such as
$\int d^{(3l)}r\, U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l)$
($l = 1, 2, \ldots$) is that, for sufficiently large $V$, i.e., in the
thermodynamic limit, it goes like $V$ (reason this out; taking one of the
particles as reference, one obtains an integral over the co-ordinate of that
particle, multiplied with an integral over relative co-ordinates, the latter
being finite). It is convenient to express this, the sum of all possible
$l$-particle cluster integrals for a given value of $l$, in the form

$$\int d^{(3l)}r\, U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l) = V\,l!\,b_l \quad (l = 1, 2, \ldots), \tag{4-16}$$

where the coefficients $b_l$, defined from the above equation in the
thermodynamic limit, will feature in determining the equation of state, as
explained below. The proportionality of the cluster integrals to the volume
$V$ implies that, in this limit, the $b_l$'s are all functions of $T$ alone. These
will be seen to be related to the virial coefficients occurring in (4-2).
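The proportionality to the volume can be illustrated with a one-dimensional toy analogue, in which the 'volume' is a length $L$ and the two-particle cluster integral is $\int_0^L\!\int_0^L f(|x_1 - x_2|)\,dx_1\,dx_2$. The short-ranged function $f(s) = -e^{-s}$ used below is hypothetical and chosen only because the integral is easy to check; the reduction to a single integral is the step of taking one particle as reference.

```python
import math

def two_particle_cluster_integral_1d(L, f, n=20000):
    """1D analogue of the l = 2 cluster integral: the integral of f(|x1 - x2|)
    over 0 <= x1, x2 <= L, reduced to 2 * int_0^L (L - s) f(s) ds and
    evaluated with the midpoint rule."""
    h = L / n
    return 2.0 * h * sum((L - (k + 0.5) * h) * f((k + 0.5) * h) for k in range(n))

f = lambda s: -math.exp(-s)   # hypothetical short-ranged f-function

for L in (5.0, 50.0, 500.0):
    ratio = two_particle_cluster_integral_1d(L, f) / L
    print(L, ratio)   # approaches the integral of f over relative separation, -2
```

The ratio of the integral to $L$ settles to a constant once $L$ is much larger than the range of $f$, with boundary corrections of relative order $1/L$.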


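Looking at (4-13) and (4-16), we have

$$\ln Z_G = V\sum_{l=1}^{\infty}\left(\frac{\gamma}{\lambda_T^3}\right)^l b_l. \tag{4-17}$$

Once we obtain $\ln Z_G$ in terms of the cluster integrals (and hence in terms
of the $b_l$'s), we can work our way to the equation of state by making use
of

$$p = -\frac{\partial\Omega}{\partial V} = k_B T\sum_{l=1}^{\infty}\left(\frac{\gamma}{\lambda_T^3}\right)^l b_l, \tag{4-18a}$$

$$\bar{N} = -\frac{\partial\Omega}{\partial\mu} = \gamma\,\frac{\partial\ln Z_G}{\partial\gamma}
= V\sum_{l=1}^{\infty}\left(\frac{\gamma}{\lambda_T^3}\right)^l l\,b_l. \tag{4-18b}$$

These formulae hold in the thermodynamic limit, in which case, as mentioned
above, the $b_l$'s may be assumed to be functions of temperature alone.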


1. The cluster expansion and the setting up of the resulting series (4-18a), 
(4-18b), is a diagram-based perturbation technique of great relevance 
in statistical mechanics, and has a vast literature devoted to it. Apart 
from original literature (which you will find cited in the references men- 
tioned here), I can suggest [111], chapters 15-18, [117], chapter 9, [66], 


chapter 5, [74], chapter 5, [101], chapter 12, and [67], chapter 10. 


2. The cluster expansion identifies the basic entities in terms of which
the perturbative series expressing the equation of state is to be set up,
these basic entities being, precisely, the cluster diagrams. An $l$-particle
cluster represents the effect of simultaneous interactions of $l$ particles
by means of the pair potential $\phi$, on the equation of state.


3. It is important to note that an $l$-particle cluster with $l > 2$ does not
represent the effect of an $l$-body force (about which little is known save
for the fact that a large body of observations on macroscopic matter
can be accounted for without postulating the existence of such forces),
but is indicative of the effect of simultaneous interactions of groups of
$l$ particles through two-body interaction forces. In the present context,
we have assumed the two-body forces to be of the central type, though
non-central forces can also be accommodated in the theory.


4. The fact that all cluster integrals
$\int d^{(3l)}r\, U_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l)$
($l = 2, 3, \ldots$) of the Ursell functions $U_l$, made up of contributions
from all possible cluster diagrams, are proportional to $V$ (for sufficiently
large $V$) regardless of the order $l$, is responsible for the thermodynamic
behavior of the system under consideration, as implied by the expansion (4-17).


5. Recall that, in the case of the ideal gas, the classical grand potential
turned out to be proportional to $V$ even for a finite system (refer to
sec. 3.2.2.2); however, not much is to be read into this fact in view
of the observation (see sec. 3.3.4, note following eq. (3-60b)) that the
classical grand canonical description is a valid approximation to the
more general quantum description only in the thermodynamic limit.


4.1.2.3 The equation of state: the star diagrams 


The formulae (4-18a), (4-18b) are power series expansions in terms of the
activity

$$z = \frac{\gamma}{\lambda_T^3} = \frac{e^{\beta\mu}}{\lambda_T^3}. \tag{4-19}$$

One can, in principle, invert the relation (4-18b) to express $z$ as a power
series in $\rho = \frac{\bar{N}}{V}$ (the mean number density that is
alternatively denoted by $n$ in the literature) and then substitute in (4-18a)
to obtain the virial expansion of $\frac{p}{k_B T}$ in terms of $\rho$. This is
easier said than done since one has to handle so many cluster diagrams. A
simplification, however, is seen to emerge as one looks at the clusters that
contribute to the first few virial coefficients: many cancellations take place,
and it transpires that the only diagrams that contribute to the virial
expansion are the ‘stars’, i.e., those among the cluster diagrams that continue
to be clusters when any one of the circles (or dots; recall that the circles
can be numbered or unnumbered, the latter being termed dots) is removed, along
with the lines emanating from it. Stars are also referred to as ‘doubly
connected’ diagrams since any dot is connected in a star to any other by at
least two paths made up of connecting lines.


The really relevant part in deriving the virial expansion is to show that only
the clusters contribute to the Ursell functions and, likewise, to prove that
only the stars contribute to the virial coefficients. These proofs are omitted
in this introductory exposition, where I present only the barest outline of
what the virial expansion consists of.


Referring to formula (4-2), we write down the first few virial coefficients in
terms of the $b_l$'s:

$$\begin{aligned}
B_1(T) &= 1, \\
B_2(T) &= -b_2(T), \\
B_3(T) &= 4b_2(T)^2 - 2b_3(T), \\
B_4(T) &= -20b_2(T)^3 + 18b_2(T)b_3(T) - 3b_4(T), \\
&\;\;\vdots
\end{aligned} \tag{4-20}$$


where the temperature dependence of the $b_l$'s is indicated explicitly. Recall
that the $b_l$'s are determined in terms of cluster diagrams among which
the simply connected clusters (i.e., the ones where each dot is connected
to every other by one, and only one, path made up of connecting
lines) occur along with the stars (a lone dot is, by definition, a star).
However, on referring to the right hand sides of the equations in (4-20),
one finds that the contributions from simply connected clusters get canceled,
leaving only the stars that ultimately determine the virial coefficients
$B_l(T)$ ($l = 1, 2, \ldots$; $B_1(T) = 1$ by definition).
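The relations (4-20) can be verified by exact power-series arithmetic. In the Python sketch below, the pressure and density series in the activity $z$ are $\frac{p}{k_B T} = \sum_l b_l z^l$ and $\rho = \sum_l l\,b_l z^l$ with $b_1 = 1$, as in (4-18a) and (4-18b); the rational values given to $b_2$, $b_3$, $b_4$ are arbitrary test numbers, and the helper names are hypothetical. The virial series built with the coefficients of (4-20) is then compared with the pressure series through order $z^4$.

```python
from fractions import Fraction

ORDER = 4   # keep powers z^0 ... z^4

def mul(a, b):
    """Product of two power series in z (coefficient lists), truncated at ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def scale(c, a):
    return [c * x for x in a]

# Arbitrary rational test values for b_2, b_3, b_4 (b_1 = 1); purely illustrative.
b = {1: Fraction(1), 2: Fraction(-3, 7), 3: Fraction(5, 11), 4: Fraction(2, 13)}

pressure = [Fraction(0)] + [b[l] for l in range(1, ORDER + 1)]    # p / (k_B T)
rho = [Fraction(0)] + [l * b[l] for l in range(1, ORDER + 1)]     # mean density

B2 = -b[2]
B3 = 4 * b[2] ** 2 - 2 * b[3]
B4 = -20 * b[2] ** 3 + 18 * b[2] * b[3] - 3 * b[4]

rho2 = mul(rho, rho)
rho3 = mul(rho2, rho)
rho4 = mul(rho3, rho)
virial = add(add(rho, scale(B2, rho2)), add(scale(B3, rho3), scale(B4, rho4)))

print(virial == pressure)
```

Because the arithmetic is exact, agreement of the two coefficient lists confirms (4-20) through the fourth virial coefficient for any choice of the $b_l$'s, not just the test values used here.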


As simple examples, the diagrams in part (A) of fig. 4-3 do not qualify as 


stars, while those in part (B) do. 


The final result leading to the virial coefficients reads as follows:

$$B_l(T) = -\frac{l-1}{l!\,V}\int d^{(3l)}r\, S_l(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_l) \quad (l = 2, 3, \ldots), \tag{4-21}$$

where the integral on the right hand side for any given value of $l$
($= 2, 3, \ldots$) is made up of contributions coming from all the $l$-particle
stars, and from these alone. Each such star, when looked upon as a diagram
with unnumbered dots, occurs with a numerical factor corresponding to
the number of topologically different ways that the numbers $1, 2, \ldots, l$
can be assigned to the dots (converting the dots to circles with numbers).
This is briefly explained in the caption to fig. 4-3.


Figure 4-3: (A) Two simple connected diagrams (clusters) that do not qualify as
stars; of these, one is a three-particle cluster while the other is a four-particle
one; (B) two simple star diagrams; of these, one is the single three-particle star
contributing to $B_3$ (there is only one possible configuration in which the circles
can be numbered with integers 1 through 3), while the other is a four-particle
star contributing to $B_4$ that actually represents a group of three diagrams with
numbers (1 through 4) assigned to the circles in three possible distinct
configurations; there are two other four-particle stars, one corresponding to six
possible configurations of numbered circles and the other to only one possible
configuration (in all, there are ten distinct star configurations contributing
to $B_4$).

We can now write down the expressions for the second and third virial
coefficients in terms of integrals involving the two-particle interaction
function $\phi(r)$:

$$B_2(T) = -\frac{1}{2V}\int d^{(3)}r_1\, d^{(3)}r_2\, f_{12} = -2\pi\int_0^{\infty} r^2\left(e^{-\beta\phi(r)} - 1\right)dr, \tag{4-22a}$$

$$\begin{aligned}
B_3(T) &= -\frac{1}{3V}\int d^{(3)}r_1\, d^{(3)}r_2\, d^{(3)}r_3\, f_{12}f_{23}f_{13} \\
&= -\frac{1}{3}\int d^{(3)}q\, d^{(3)}Q\, f(|\mathbf{q}|)\, f\left(\left|\mathbf{Q} + \tfrac{1}{2}\mathbf{q}\right|\right) f\left(\left|\mathbf{Q} - \tfrac{1}{2}\mathbf{q}\right|\right).
\end{aligned} \tag{4-22b}$$

Here, in the last expression, we have first transformed from integration
variables $\mathbf{r}_2, \mathbf{r}_3$ to
$\mathbf{Q} = \frac{1}{2}(\mathbf{r}_2 + \mathbf{r}_3 - 2\mathbf{r}_1)$ and
$\mathbf{q} = \mathbf{r}_3 - \mathbf{r}_2$, and then integrated over
$\mathbf{r}_1$, while there remain the integrations over volume elements
$d^{(3)}q$, $d^{(3)}Q$.


The next section (sec. 4.1.3) is devoted to a brief outline of a few models 
of the two-body potential ¢(r) for which the first few virial coefficients 
have been worked out, either analytically or by numerical integration, 
taking into account the contributions of the relevant star diagrams. This 
section also includes a few words on how these results compare with 
experimental observations on real gases and with computer simulations 


of the behavior of an assembly of identical particles. 


An overview of the classical virial expansion, along with pointers to its limitations, is to be found in [101], [117]. For further details on various aspects of the classical virial expansion, see [100].


From a practical point of view, the condition for the validity of the virial 


expansion can be expressed as 


$$B_2(T)\,\rho \ll 1. \tag{4-23}$$


4.1.3 The classical virial expansion: results 


Generally speaking, the equation of state (4-2), truncated after a relatively 
small number of terms (see below) for which the virial coefficients have 
been evaluated with appropriate models for the potential function ¢(r), 
agrees well with experimental observations on real gases at relatively low
pressures and high temperatures, and with numerical simulations of the 
behavior of assemblies of identical particles. In other words, the deviation 
from the ideal behavior (eq. (4-1e)) is well explained, in the case of dilute 


gases, by the theory underlying the classical virial expansion. 
Theoretical predictions depend to some extent on the choice of the interparticle potential $\phi(r)$, where the choice becomes necessary due to the fact that the calculation of the potential from more fundamental principles (based on nuclear and atomic structures) constitutes too complex a problem to be of any use in the present context. Experimental studies on atomic interactions indicate that the potential is to have a hard repulsive core ($\phi(r)$ large and positive for small $r$) and a short range attractive tail ($\phi(r)$ negative and rapidly decreasing in magnitude at large $r$). The repulsive core is principally a consequence of the exclusion principle while the attractive tail arises by virtue of fluctuation-induced dipolar interactions between neutral atoms or molecules.


4.1.3.1 Intermolecular potential: simple models 


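The simplest potential consistent with the above requirements corresponds to the hard sphere potential involving an infinitely repulsive core for $r < \sigma$ (the 'collision diameter'), and no attraction or repulsion for $r > \sigma$:

$$[\text{hard sphere potential:}]\quad \phi(r) = \begin{cases} \infty & \text{for } r < \sigma \\ 0 & \text{for } r > \sigma \end{cases} \qquad \text{i.e.,} \quad f(r) = \begin{cases} -1 & \text{for } r < \sigma \\ 0 & \text{for } r > \sigma; \end{cases} \tag{4-24}$$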


as depicted in fig. 4-4(A). The hard sphere potential, which is based on the assumption that two molecules experience an infinite repulsion when they are at a certain minimum distance from each other and no force otherwise, is characterized by just one adjustable parameter $\sigma$, which detracts to some extent from its usefulness. A potential that is more realistic but still simple in form is the 'square well' potential depicted in fig. 4-4(B). Here, in addition to the infinitely repulsive core for $0 < r < \sigma$, there is a region of attractive interaction for $\sigma < r < r_0$, $r_0$ being the range of attraction, beyond which there is no interaction, either attractive or repulsive:


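$$[\text{square well potential:}]\quad \phi(r) = \begin{cases} \infty & \text{for } r < \sigma \\ -\epsilon & \text{for } \sigma < r < r_0 \\ 0 & \text{for } r > r_0 \end{cases} \qquad \text{i.e.,} \quad f(r) = \begin{cases} -1 & \text{for } r < \sigma \\ e^{\beta\epsilon} - 1 & \text{for } \sigma < r < r_0 \\ 0 & \text{for } r > r_0. \end{cases} \tag{4-25}$$

The usefulness of the square well potential lies in its simple form and in the fact that it accommodates three adjustable parameters ($\sigma$, $r_0$, and $\epsilon$, the last having the significance of the strength of the attractive interaction). As we will see, at sufficiently high temperatures this potential gives a second virial coefficient that results in the van der Waals equation of state (refer to (4-31) below).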


Finally, fig. 4-4(C) depicts the Lennard-Jones ('LJ') potential, widely used in the calculation of the second and higher virial coefficients, in terms of which the theory underlying the virial expansion can be compared against the actually observed behavior of dilute gases and the behavior inferred from numerical simulations in molecular dynamics studies. This potential is defined as

$$[\text{LJ potential:}]\quad \phi(r) = 4\epsilon\Big[\Big(\frac{\sigma}{r}\Big)^{12} - \Big(\frac{\sigma}{r}\Big)^{6}\Big], \tag{4-26}$$

and is characterized by two parameters, namely, $\epsilon$ and $\sigma$, that specify the strength and the range (of the repulsive core) respectively ($\phi = 0$ at $r = \sigma$, and the repulsion and attraction balance each other at $r = 2^{1/6}\sigma$). Quite a few variants of this basic '6-12 potential' have been considered in the literature, with additional parameters introduced so as to meet specific requirements, and virial coefficients have been computed numerically with these as well as with the basic '6-12' potential.


4.1.3.2 The second and the third virial coefficients 


The hard sphere potential is of such a simple form that quite a few virial coefficients ($B_2, B_3, \ldots$) can be worked out analytically or numerically. For instance, one has the following results for the second and the third virial coefficients for the hard sphere potential, where both are seen to be independent of temperature (check these out from (4-22a), and from the second form in (4-22b), i.e., the one involving the transformed variables $\mathbf{q}$ and $\mathbf{Q}$; refer to [85]):

$$[\text{hard sphere potential:}]\quad B_2 = \frac{2\pi}{3}\sigma^3, \qquad B_3 = \frac{5\pi^2}{18}\sigma^6. \tag{4-27}$$

The second virial coefficient for the hard sphere potential is obtained as $B_2 = 2\pi\int_0^\sigma r^2\,dr$. As for the third virial coefficient, note from the second form in (4-22b) and from the expression for $f(r)$ in (4-24), that it can be expressed as $\frac{1}{3}\int d^{(3)}q\,v(q;\sigma)$, where $v(q;\sigma)$ is the volume of the shaded region of intersection of two spheres, each of radius $\sigma$, the distance between their centers being $q$ ($< \sigma$) (reason out why; refer to fig. 4-5).

Figure 4-4: Simple models of intermolecular potentials $\phi(r)$ (schematic); (A) the hard sphere potential, which has an infinitely strong repulsive core for $r < \sigma$ (the 'collision diameter') and zero interaction for $r > \sigma$ (see (4-24)); (B) the square well potential (see (4-25)), which has an infinitely strong repulsive core as in the hard sphere case but, at the same time, involves an attractive interaction for $\sigma < r < r_0$, $r_0$ being the range of the latter; apart from the parameters $\sigma$, $r_0$, the square well potential is characterized by the strength $\epsilon$ of the attractive interaction; (C) the Lennard-Jones potential (eq. (4-26)) where the repulsive and the attractive forces vary continuously with separation; the repulsive barrier is steep while the attractive force falls to zero with the separation continuously but rapidly, with a range around $r_0$ shown in the figure.
Figure 4-5: Geometrical construction for working out the third virial coefficient ($B_3$) for the hard sphere potential; vectors $\mathbf{Q}$ and $\mathbf{q}$ are introduced for three particles at positions $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$, as explained in the paragraph following (4-22b); $B_3$ is then obtained as ($\frac{1}{3}$ times) the integral over the volume of a region where $|\mathbf{q}|$, $|\mathbf{Q} + \frac{1}{2}\mathbf{q}|$, $|\mathbf{Q} - \frac{1}{2}\mathbf{q}|$ are all less than $\sigma$; spheres are drawn with radius $\sigma$ around A and B as centers, where the vector extending from A to B denotes $\mathbf{q}$; the distance AB is less than $\sigma$, which guarantees one of the three constraints mentioned above; $\mathbf{Q}$ is represented by the vector extending from O (the midpoint of AB) to the variable point P, which has to lie within the shaded region so as to satisfy the other two constraints.
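The geometrical construction can be checked numerically. The sketch below assumes the standard expression for the volume common to two equal spheres of radius $\sigma$ with centers a distance $q$ apart, $v(q;\sigma) = \frac{\pi}{12}(4\sigma + q)(2\sigma - q)^2$, and integrates it over $q < \sigma$ with a composite Simpson rule; the function names and the choice $\sigma = 1$ are illustrative.

```python
import math

def lens_volume(q, sigma):
    """Volume common to two spheres of radius sigma whose centers are q apart."""
    if q >= 2.0 * sigma:
        return 0.0
    return math.pi * (4.0 * sigma + q) * (2.0 * sigma - q) ** 2 / 12.0

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def b3_hard_sphere(sigma):
    """B3 = (1/3) * integral of v(q; sigma) over the ball |q| < sigma."""
    integrand = lambda q: 4.0 * math.pi * q * q * lens_volume(q, sigma)
    return simpson(integrand, 0.0, sigma) / 3.0

sigma = 1.0
b2 = 2.0 * math.pi * sigma ** 3 / 3.0
b3 = b3_hard_sphere(sigma)
# the three printed numbers should agree: numerical B3, 5*pi^2*sigma^6/18, (5/8)*B2^2
print(b3, 5.0 * math.pi ** 2 * sigma ** 6 / 18.0, 5.0 * b2 ** 2 / 8.0)
```

The last comparison expresses the hard sphere result in the equivalent form $B_3 = \frac{5}{8}B_2^2$.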


One finds that, with the hard sphere potential, the virial coefficients worked out above are positive and independent of temperature, so that the pressure of the gas is higher than the ideal gas pressure at any given temperature and density; this is explained by the absence of an attractive part (referred to as the 'attractive tail' in the graphical representation of $\phi(r)$) in the hard sphere potential.


In contrast, the second virial coefficient for the square well potential is given by

$$[\text{square well potential:}]\quad B_2 = \frac{2\pi}{3}\sigma^3\Big[1 - \Big(\Big(\frac{r_0}{\sigma}\Big)^3 - 1\Big)\big(e^{\frac{\epsilon}{k_BT}} - 1\big)\Big], \tag{4-28}$$

(check this out from (4-22a), making use of (4-25)) which implies a negative correction to the ideal gas pressure at low temperatures and a positive correction at high temperatures, and is thus more in keeping with the observed behavior of real dilute gases. At low temperatures, the attractive part of the interaction tends to cancel part of the pressure arising due to the thermal motion, an effect that tends to get balanced, especially at relatively high temperatures, by the repulsive core which implies a positive pressure correction due to the so-called 'excluded volume' effect.
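As a check on (4-28), the following sketch compares the closed form with a direct midpoint-rule evaluation of (4-22a) for the square well, and locates the temperature at which $B_2$ changes sign. The parameter values ($\sigma = 1$, $r_0 = 1.5$, $\epsilon = 1$, $k_B = 1$) and the function names are illustrative assumptions.

```python
import math

def b2_square_well(T, sigma, r0, eps, k_B=1.0):
    """Closed form (4-28) for the square well second virial coefficient."""
    b = 2.0 * math.pi * sigma ** 3 / 3.0
    return b * (1.0 - ((r0 / sigma) ** 3 - 1.0) * math.expm1(eps / (k_B * T)))

def b2_numerical(T, sigma, r0, eps, k_B=1.0, n=2000):
    """Midpoint-rule evaluation of (4-22a): B2 = -2*pi * integral of r^2 f(r) dr."""
    def mayer_f(r):
        if r < sigma:
            return -1.0
        if r < r0:
            return math.expm1(eps / (k_B * T))
        return 0.0
    total = 0.0
    # integrate the two pieces separately so that no panel straddles a jump in f(r)
    for lo, hi in ((0.0, sigma), (sigma, r0)):
        h = (hi - lo) / n
        for i in range(n):
            r = lo + (i + 0.5) * h
            total += r * r * mayer_f(r) * h
    return -2.0 * math.pi * total

def sign_change_temperature(sigma, r0, eps, k_B=1.0):
    """Temperature at which (4-28) vanishes."""
    x = (r0 / sigma) ** 3
    return eps / (k_B * math.log(x / (x - 1.0)))

T_zero = sign_change_temperature(1.0, 1.5, 1.0)
print(T_zero, b2_square_well(1.0, 1.0, 1.5, 1.0), b2_square_well(10.0, 1.0, 1.5, 1.0))
```

Below `T_zero` the attractive well dominates and $B_2$ is negative; above it the excluded volume term dominates and $B_2$ is positive.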


At sufficiently high temperatures one can write, for the square well potential,

$$[\text{square well potential:}]\quad B_2 \approx \frac{2\pi}{3}\sigma^3\Big[1 - \Big(\Big(\frac{r_0}{\sigma}\Big)^3 - 1\Big)\frac{\epsilon}{k_BT}\Big], \tag{4-29}$$

i.e.,

$$B_2 = b - \frac{a}{k_BT}, \tag{4-30a}$$

where

$$a = \frac{2\pi}{3}\sigma^3\epsilon\Big(\Big(\frac{r_0}{\sigma}\Big)^3 - 1\Big), \qquad b = \frac{2\pi}{3}\sigma^3. \tag{4-30b}$$

Supposing that the equation of state (4-2) is truncated after the $B_2\rho$ term, this gives an equation of state

$$\frac{P}{k_BT} = \rho\Big(1 + b\rho - \frac{a\rho}{k_BT}\Big), \tag{4-31}$$

which, at the present level of approximation, agrees with the van der Waals equation of state.


The van der Waals equation of state reads

$$\Big(P + \frac{aN^2}{V^2}\Big)(V - Nb) = Nk_BT, \tag{4-32}$$

where $a, b$ are the van der Waals constants per molecule (the molar van der Waals constants commonly referred to are then $aA^2$ and $bA$ respectively, where $A$ stands for the Avogadro number). This reduces to (4-31) when one ignores $b^2\rho^2$ compared to unity, which is consistent with ignoring the third and higher order virial coefficients.
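The reduction can be made explicit by solving the van der Waals equation, $(P + aN^2/V^2)(V - Nb) = Nk_BT$, for the pressure and expanding in powers of $\rho = N/V$ (a short worked step, valid for $b\rho < 1$):

```latex
\frac{P}{k_B T}
  = \frac{\rho}{1 - b\rho} - \frac{a\rho^{2}}{k_B T}
  = \rho\Big[\,1 + \Big(b - \frac{a}{k_B T}\Big)\rho + b^{2}\rho^{2} + b^{3}\rho^{3} + \cdots\Big].
```

The coefficient of $\rho$ inside the bracket is the second virial coefficient $b - a/(k_BT)$, while the terms $b^2\rho^2, b^3\rho^3, \ldots$ are the ones dropped when the series is truncated at the second virial coefficient.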


The third virial coefficient can also be worked out analytically for the 
square well potential. However, attempts at fully analytical evaluations of 
the third and higher virial coefficients quickly become pointless, and nu- 
merical evaluations become far more fruitful, especially with the Lennard- 
Jones potential or some related variant, where the parameters defining 
the potential are determined by comparing with chosen experimental data 


for real gases. 


The parameters (such as $\sigma$, $r_0$, $\epsilon$ for the square well potential) differ from one
gas to another. For a given gas, these are determined from a chosen set 
of experimental data by appropriate curve fitting techniques, and are then 
made use of in predicting further theoretical data points under more varied 
conditions of temperature and pressure, which finally leads to a compari- 
son with corresponding data points obtained experimentally. As mentioned 
earlier, simulations based on molecular dynamics are also used in addition 


to experimentally observed data. 
In calculating the virial coefficient of a given order, one commonly uses
parameter values determined by comparing the experimental with the the- 
oretically predicted data obtained from the previous level of approximation 
(i.e., with the virial coefficient of the immediately preceding order) and then 


makes appropriate refinements. 


We turn our attention now to the Lennard-Jones potential, which has been widely used in the numerical evaluation of the second and higher virial coefficients. A commonly used approach here (as in the case of other analogous potentials) is to use dimensionless variables and dimensionless virial coefficients. Thus, defining the dimensionless variables

$$x = \frac{r}{\sigma}, \qquad \tilde{T} = \frac{k_BT}{\epsilon}, \tag{4-33a}$$

termed the reduced intermolecular separation and the reduced temperature respectively, one obtains the following expression for the reduced second virial coefficient as a function of the reduced temperature:

$$\tilde{B}_2(\tilde{T}) = \frac{B_2(T)}{b_0} = -3\int_0^\infty dx\,x^2\Big[\exp\Big(-\frac{4}{\tilde{T}}\big(x^{-12} - x^{-6}\big)\Big) - 1\Big] \qquad \Big(b_0 = \frac{2\pi}{3}\sigma^3\Big), \tag{4-33b}$$

where $b_0$ stands for the second virial coefficient for the hard sphere potential with collision diameter $\sigma$.
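The reduced integral in (4-33b) is straightforward to evaluate numerically. The following is a minimal sketch using the midpoint rule on a finite range plus an analytic estimate of the tail; the cutoff `x_max`, the number of panels `n`, and the function name are illustrative choices rather than anything prescribed in the text.

```python
import math

def b2_lj_reduced(t_red, x_max=20.0, n=200000):
    """Reduced Lennard-Jones B2 of (4-33b) by the midpoint rule on (0, x_max)."""
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        inv6 = x ** -6
        # expm1(y) = exp(y) - 1; for small x the argument is hugely negative and expm1 gives -1
        total += x * x * math.expm1(-(4.0 / t_red) * (inv6 * inv6 - inv6)) * h
    # beyond x_max the integrand is close to (4 / t_red) * x**-4, integrated analytically
    tail = (4.0 / t_red) / (3.0 * x_max ** 3)
    return -3.0 * (total + tail)

for t_red in (1.0, 3.3, 3.5, 5.0):
    print(t_red, b2_lj_reduced(t_red))
```

The reduced coefficient comes out negative at low reduced temperature and positive at high reduced temperature, with the change of sign occurring between $\tilde{T} = 3.3$ and $\tilde{T} = 3.5$.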


Assuming that the intermolecular pair potential has the same form for all substances though, perhaps, with various different values of the range and energy parameters ($\sigma$, $\epsilon$ of eq. (4-26) in the case of the Lennard-Jones potential), the reduced virial coefficients $\tilde{B}_2, \tilde{B}_3, \ldots$ appear as universal functions of the reduced temperature $\tilde{T}$.

Methods have been devised to determine the virial coefficients of successive orders (in particular, $B_2$, $B_3$) from experimental data in stages, where the experimental data correspond to successively higher ranges of pressure and density.


Experimental data for a number of gases do indeed conform, to a con- 
siderable degree, to the theoretically predicted dependence of the re- 
duced Lennard-Jones second virial coefficient on the reduced tempera- 
ture. A similar comparison in the case of the reduced third virial coeffi- 
cient shows up some discrepancy, presumably owing to the non-additive 


nature of the interaction potential for three particles. 


4.1.3.3 Digression: the van der Waals equation as a mean field approximation


We saw in sec. 4.1.3.2 that the equation of state worked out from the second virial coefficient (which means that the third and higher virial coefficients are ignored) reduces to the van der Waals equation if the pair potential $\phi$ is assumed to be of the square well form (4-25). In this case, the idea of ignoring the third virial coefficient onward is equivalent to a mean field approximation, where each constituent particle is assumed to be characterized by a constant mean potential energy due to all the other particles taken together, and fluctuations over this mean energy are ignored. This is a reasonably good assumption at low densities (ones for which it suffices to truncate the virial series at the first approximation) where the leading term of the virial series corresponds to the potential energy of each particle being zero.


In other words the partition function for a system of $N$ particles is obtained by assuming a potential energy, say, $\phi_0$ per particle where $\phi_0$ is to be worked out from the mean potential energy of the system in the ideal gas approximation, the expression for the potential energy function being

$$\Phi = \sum_{\{ij\}}\phi(r_{ij}), \tag{4-34}$$

where $\phi(r)$ is the pair potential (refer to (4-25) for the square well; other forms of the potential can also be admitted) and the summation extends over all pairs ($i$-$j$) of molecules, there being $\frac{N(N-1)}{2}$ such pairs. The mean potential energy of the system, calculated in the ideal gas approximation, is then

$$\langle\Phi\rangle = \frac{N(N-1)}{2V}\int 4\pi r^2\,dr\,\phi(r) \tag{4-35}$$

(check this out) where the summation over pairs is replaced with $\frac{N(N-1)}{2}$ times the potential energy of a typical pair. The mean potential energy per particle, calculated with the square well potential (the integral being taken over $r > \sigma$, since the hard core is accounted for separately through the excluded volume), is then given by (we consider $N \to \infty$)

$$\phi_0 = \frac{\langle\Phi\rangle}{N} = -\frac{N}{V}\,a, \tag{4-36}$$

with $a$ as in (4-30b) (check this out).




In other words, we can now think of the system as a collection of $N$ independent particles, each with a classical energy function

$$H = \frac{p^2}{2m} + \phi_0, \tag{4-37}$$

for which the canonical partition function is given by

$$Z_c = \frac{1}{N!\,h^{3N}}\,z^N, \qquad z = \int d^{(3)}r\,d^{(3)}p\;e^{-\beta H(\mathbf{r},\mathbf{p})}, \tag{4-38a}$$

where now the volume integral in $z$ is assumed to yield $V - Nb$ instead of $V$ because of the 'excluded volume' effect. For a collision diameter (distance of closest approach between two molecules) $\sigma$, the excluded volume per molecule is $b = \frac{2\pi}{3}\sigma^3$ (reason this out; this is the same expression as in the second equation in (4-30b); it is the collision diameter that enters into the expression for the intermolecular potential (4-25)), and thus the partition function works out to

$$Z_c = \frac{1}{N!\,h^{3N}}\,(V - Nb)^N\,(2\pi mk_BT)^{\frac{3N}{2}}\,e^{-N\beta\phi_0}, \tag{4-38b}$$

with $\phi_0$ given by the last expression in (4-36). This yields the following expression for the pressure:

$$P = \frac{1}{\beta}\frac{\partial\ln Z_c}{\partial V} = \frac{Nk_BT}{V - Nb} - \frac{N^2}{V^2}\,a \tag{4-39a}$$

(check this out) which, on rearranging terms, gives the van der Waals equation

$$\big(P + a\rho^2\big)\Big(\frac{1}{\rho} - b\Big) = k_BT \qquad \Big(\rho = \frac{N}{V}\Big), \tag{4-39b}$$

the same as the equation (4-32) obtained by using the square well potential and truncating the virial expansion at the second term (check this out by reconciling the slight difference in notations).
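The steps left as checks above can be sketched as follows, assuming the square well form of the pair potential, the attractive part integrated over $\sigma < r < r_0$ only, and a partition function whose volume dependence is $(V - Nb)^N e^{-N\beta\phi_0}$:

```latex
\int_{\sigma}^{r_0} 4\pi r^{2}\,(-\epsilon)\,dr
   = -\frac{4\pi}{3}\,\epsilon\,\big(r_0^{3} - \sigma^{3}\big) = -2a
\;\;\Longrightarrow\;\;
\phi_0 = \frac{1}{N}\cdot\frac{N^{2}}{2V}\,(-2a) = -\frac{N}{V}\,a ,
\\[6pt]
\ln Z_c = N\ln(V - Nb) + \frac{\beta N^{2} a}{V} + (\text{terms independent of } V)
\;\;\Longrightarrow\;\;
P = k_B T\,\frac{\partial \ln Z_c}{\partial V}
  = \frac{N k_B T}{V - Nb} - \frac{N^{2}}{V^{2}}\,a .
```

Here $N(N-1)$ has been replaced with $N^2$, which is appropriate for large $N$.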


Fig. 4-6 depicts a set of isothermal curves conforming to the van der Waals equation of state, in which the curve ABCDEF has a looped structure and includes a region of instability (the part CD), along with regions (BC, DE) corresponding to superheated liquid and supercooled vapor. The curve A'B'C' corresponds to the critical temperature, while the curve A''B'' corresponds to the gas phase above the critical temperature. While the latter two curves are reproduced in experimental data, the first one is not, since it involves unstable states not consistent with the physical conditions under which the behavior of the system is examined. The isotherm that is actually observed includes the straight portion (BPE; corresponding to coexistence of vapor and liquid phases) along with the parts AB and EF corresponding to the pure liquid phase and the pure vapor phase. The straight line portion BE is obtained by employing the so-called 'equal-area' construction of Maxwell (i.e., by making the areas BCP and PDE equal to each other).
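The equal-area construction can be carried out numerically. The sketch below works in reduced variables, assuming the standard reduced form of the van der Waals equation, $P_r = 8T_r/(3v_r - 1) - 3/v_r^2$ (pressure, temperature and volume per molecule measured in units of their critical values); the function names, the bisection helper and the cutoff `v_far` are illustrative, and the routine is meant for reduced temperatures not too far below 1.

```python
import math

def p_vdw(v, t):
    """Reduced van der Waals pressure at reduced volume v and reduced temperature t."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / (v * v)

def dp_dv(v, t):
    """Slope of the reduced isotherm."""
    return -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v ** 3

def bisect(func, lo, hi, iterations=200):
    """Root of func in [lo, hi], assuming func(lo) and func(hi) differ in sign."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maxwell_construction(t, v_far=1.0e7):
    """Return (P, v_liquid, v_vapor) satisfying the equal-area rule for t < 1."""
    v_min = 1.0 / 3.0 + 1e-9
    v_spin_lo = bisect(lambda v: dp_dv(v, t), v_min, 1.0)   # local minimum of P
    v_spin_hi = bisect(lambda v: dp_dv(v, t), 1.0, v_far)   # local maximum of P

    def end_points(p):
        v_liq = bisect(lambda v: p_vdw(v, t) - p, v_min, v_spin_lo)
        v_vap = bisect(lambda v: p_vdw(v, t) - p, v_spin_hi, v_far)
        return v_liq, v_vap

    def area_mismatch(p):
        # integral of the isotherm between the end points minus the rectangle p * (v_vap - v_liq)
        v_liq, v_vap = end_points(p)
        integral = (8.0 * t / 3.0) * math.log((3.0 * v_vap - 1.0) / (3.0 * v_liq - 1.0))
        integral += 3.0 / v_vap - 3.0 / v_liq
        return integral - p * (v_vap - v_liq)

    p_lo = max(p_vdw(v_spin_lo, t), 1e-6) + 1e-9
    p_hi = p_vdw(v_spin_hi, t) - 1e-9
    p_eq = bisect(area_mismatch, p_lo, p_hi)
    return (p_eq,) + end_points(p_eq)

p_eq, v_liq, v_vap = maxwell_construction(0.9)
print(p_eq, v_liq, v_vap)
```

The bisection on the pressure works because the area mismatch decreases monotonically as the horizontal line is raised from the local minimum of the isotherm to its local maximum.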


The mean field theory sketched above is, on the face of it, an exact procedure for the calculation of the partition function, but it ignores the fluctuations in the mutual potential energy between any two molecules of the gas. It is an interesting question in statistical mechanics as to whether an isotherm such as ABPEF can be obtained in some realistic model, for which the partition function can be obtained exactly in the thermodynamic limit. Evidently, the van der Waals equation can serve as an important guide in obtaining an equation of state from such a partition function.

Figure 4-6: Isothermal curves according to the van der Waals equation of state ((4-39b)), schematic; ABCDEF is an isothermal curve below the critical temperature, with a looped structure, and implies a phase transition, with the parts AB and EF representing the liquid and the vapor phases; the part CPD represents a succession of unstable states, for which $\big(\frac{\partial P}{\partial V}\big)_T$ is positive, while the parts BC and DE are not realized in practice; the linear portion BPE is obtained by employing the 'equal area' construction of Maxwell, and represents a succession of coexisting phases; the curve A'B'C' is the critical isotherm, with B' as the critical point; A''B'' represents an isothermal curve above the critical temperature.


The van der Waals equation has played an important role in the history of investigations on the gas-liquid transition. It is the simplest model of a fluid that reproduces a number of observed features of gas-liquid transitions, including the existence of a critical point where the difference between a liquid and its vapor disappears. This is because, in contrast to the hard sphere model, the van der Waals equation results under the combined effect of a short range repulsion and a longer range attraction between molecules.


4.2 Quantum corrections to classical virial coefficients


The classical virial expansion is of limited relevance in the theory of in- 
teracting systems. Though it describes the behavior of interacting gases 
at relatively low pressures and high temperatures, i.e., at low densities, 
in the form of corrections over the ideal gas behavior, it does not take 
into account quantum effects and is therefore suspect from a theoretical 


point of view. 


It is possible, in principle, to introduce quantum corrections to the clas- 
sical virial expansion in the form of an expansion in powers of the Planck 
constant h though, as far as concrete results of practical relevance are 
concerned, such calculations make sense only for the first few terms 
of the virial series. In particular, instructive results can be derived for 
the second virial coefficient $B_2$ by referring to a scheme for working out
the quantum corrections to the classical partition function, developed by 


Kirkwood. 
4.2.1 The classical limit for interacting systems


This section is based on [66], chapter 3, section 16. 


In this context, it is instructive to derive the classical expression for the canonical partition function $Z_c$ (expression (2-66b)) as the leading approximation, as $h$ goes to zero, of the quantum mechanical expression for a system of $N$ identical particles, which we write as (the caret (^) over a symbol denotes a quantum mechanical operator)

$$Z_c = \mathrm{Tr}\{e^{-\beta\hat{H}}\} = \mathrm{Tr}\,\exp\Big[-\beta\Big(\sum_{i=1}^{s}\frac{\hat{p}_i^2}{2m} + \hat{\Phi}(\{\hat{q}\})\Big)\Big], \tag{4-40}$$

where $s = 3N$ ($N$ = number of particles making up the system under consideration; for notation, see sec. 1.3.3.1; check (4-40) against (2-15b)). Note that, even though we seek an approximation for vanishingly small $h$, the classical expression involves a factor of $\frac{1}{h^{3N}}$ in it. This, however, does not lead to any singularity in the expressions for the thermodynamic functions resulting from it.


Recall that the classical limit was obtained in sec. 3.3.4 for the ideal gas by way of adopting the Maxwell-Boltzmann protocol of state counting and going over to the thermodynamic limit.

As can be inferred from sec. 1.3.3.1, the point of departure of the quantum from the classical theory is the non-zero value of the fundamental commutators (1-10). In other words, the classical limit is obtained from (4-40) by replacing it with

$$Z_c = \mathrm{Tr}\Big\{\Big[\exp\Big(-\beta\sum_{i=1}^{s}\frac{\hat{p}_i^2}{2m}\Big)\Big]\Big[\exp\big(-\beta\hat{\Phi}(\{\hat{q}\})\big)\Big]\Big\}, \tag{4-41}$$

which is obtained by assuming that all the operators $\hat{p}_i$, $\hat{q}_i$ ($i = 1, 2, \ldots, s$) commute with each other. This follows from the fact that, for two commuting operators $\hat{A}$, $\hat{B}$, the identity $e^{\hat{A}+\hat{B}} = e^{\hat{A}}e^{\hat{B}}$ holds.
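For operators that do not commute, the factorization picks up corrections given by the Baker-Campbell-Hausdorff formula, quoted here as a standard result rather than something derived in the text. For a single degree of freedom, with $\hat{A} = -\beta\hat{p}^2/2m$ and $\hat{B} = -\beta\Phi(\hat{q})$, the leading correction is seen to be of the first order in $\hbar$:

```latex
e^{\hat A}\,e^{\hat B}
  = \exp\Big(\hat A + \hat B + \tfrac{1}{2}[\hat A,\hat B]
      + \tfrac{1}{12}\big[\hat A,[\hat A,\hat B]\big]
      - \tfrac{1}{12}\big[\hat B,[\hat A,\hat B]\big] + \cdots\Big),
\\[6pt]
[\hat A,\hat B]
  = \beta^{2}\Big[\frac{\hat p^{2}}{2m},\,\Phi(\hat q)\Big]
  = -\frac{i\hbar\,\beta^{2}}{2m}\,\big(\hat p\,\Phi'(\hat q) + \Phi'(\hat q)\,\hat p\big).
```

Each further nested commutator brings in at least one more power of $\hbar$, which is why setting the commutators to zero gives the leading term of an expansion in powers of $\hbar$.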

To proceed further, we have to use properly symmetrized basis functions (i.e., completely symmetric and completely anti-symmetric ones for bosons and fermions respectively). If $\psi_m(Q)$ ($m = 0, 1, 2, \ldots$) are the properly symmetrized and normalized energy eigenfunctions, then the partition function (first equality in (4-40)) can be expressed as

$$Z_c = \sum_m\int dQ\,\psi_m^*(Q)\,e^{-\beta\hat{H}}\,\psi_m(Q), \tag{4-42}$$

where we adopt the following notation. As in earlier sections, $Q$ stands for the set of position co-ordinates $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$ for the system under consideration, made up of $N$ identical particles, while $dQ$ represents the $3N$-dimensional volume element $dQ = d^{(3)}r_1\,d^{(3)}r_2\cdots d^{(3)}r_N$. With reference to the ordering, $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$, of the position vectors, we will have occasion to refer to permutations in the ordering, where a typical permutation ($P_r$) is characterized by a parity $|P_r|$, telling us whether the permutation is an even ($|P_r| = +1$) or an odd ($|P_r| = -1$) one.


Since the trace is invariant under a change of basis, we effect a transformation from the energy basis (with $\psi_m(Q)$ as the basis functions) to the momentum basis where the basis functions are, say, $\phi_P(Q)$, corresponding to the set of momenta $P = \{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$. Ignoring, for the time being, the symmetry requirements on the wave functions and their normalization, the functions $\phi_P(Q)$ appear, in the co-ordinate representation, as plane waves of the form

$$\phi_P(Q) \sim e^{\frac{i}{\hbar}P\cdot Q} = \exp\Big[\frac{i}{\hbar}\sum_{i=1}^{N}\mathbf{p}_i\cdot\mathbf{r}_i\Big],$$

and the transformation from the energy basis to the momentum basis is effected by means of Fourier transformation. However, an appropriate symmetrization of the wave functions is an essential requirement in the quantum mechanical description of systems of identical particles, where a typical wave function in the momentum basis appears in the form

$$\phi(P,Q) = \frac{1}{\sqrt{N!}}\sum_{P_r}\zeta_{P_r}\,e^{\frac{i}{\hbar}P\cdot(P_rQ)}, \tag{4-43a}$$

$$\phi(P,Q) = \frac{1}{\sqrt{N!}}\sum_{P_p}\zeta_{P_p}\,e^{\frac{i}{\hbar}(P_pP)\cdot Q}. \tag{4-43b}$$

Here, $P_r$, $P_p$ stand for permutation operators on sets of position and momentum vectors, as indicated below. With reference to the former, $\zeta_{P_r}$ is defined to be $+1$ for a system of bosons and, for one made up of fermions, to be equal to the parity of $P_r$ ($+1$ for an even permutation and $-1$ for an odd one). Further, $P\cdot(P_rQ)$ stands for

$$P\cdot(P_rQ) = \sum_{k=1}^{N}\mathbf{p}_k\cdot\mathbf{r}_{\nu_k},$$

where the indices $\nu_k$ ($k = 1, 2, \ldots, N$) are defined as

$$P_r\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\} = \{\mathbf{r}_{\nu_1}, \mathbf{r}_{\nu_2}, \ldots, \mathbf{r}_{\nu_N}\},$$

thereby defining the action of the permutation operator $P_r$. The operator $P_p$ acts on the set of momentum vectors $\{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$ in an analogous manner.

The symmetrization is achieved by means of the summation over all the distinct permutations $P_r$ of the set $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$ on the right hand side of (4-43a). The expression in (4-43b) is obtained by replacing the permutation of position vectors with one acting on the set of momentum vectors, satisfying the requirement $P\cdot(P_rQ) = (P_pP)\cdot Q$. Each of the permutations $P_r$ is associated with a corresponding $P_p$, and the summation over all distinct permutations $P_r$ can then be replaced with a summation over all distinct $P_p$'s.


Expressing the trace in the canonical partition function $Z_c$ in terms of the symmetrized momentum eigenfunctions $\phi(P,Q)$ instead of the energy eigenfunctions $\psi_m(Q)$, one obtains

$$Z_c = \frac{1}{N!}\,\frac{1}{h^{3N}}\int dP\,dQ\;\phi^*(P,Q)\,e^{-\beta\hat{H}}\,\phi(P,Q), \tag{4-44a}$$

where the factor $\frac{1}{N!}$ arises from the normalization condition of the functions $\psi_m(Q)$, and the factor $\frac{1}{h^{3N}}$ arises on converting the summation over $m$ to an integration over the set of momentum variables $P$ [66]. On using the expression (4-43b), one obtains

$$Z_c = \Big(\frac{1}{N!}\Big)^2\frac{1}{h^{3N}}\sum_{P_p,\,P_p'}\zeta_{P_p}\zeta_{P_p'}\int dP\,dQ\,\big[P_p\,e^{-\frac{i}{\hbar}P\cdot Q}\big]\,e^{-\beta\hat{H}}\,\big[P_p'\,e^{\frac{i}{\hbar}P\cdot Q}\big]. \tag{4-44b}$$

For given permutations $P_p$, $P_p'$, if $P_p'^{-1}$ is the inverse of $P_p'$, one can define a new permutation $P_p'' = P_p'^{-1}P_p$, such that $\zeta_{P_p''} = \zeta_{P_p}\zeta_{P_p'}$ (check this out). Hence, the summation over $P_p'$ in (4-44b) can be performed by replacing $\sum_{P_p,\,P_p'}\zeta_{P_p}\zeta_{P_p'}$ with $N!$ times $\sum_{P_p''}\zeta_{P_p''}$ (reason this out). We now rename $\sum_{P_p''}\zeta_{P_p''}$ as $\sum_{P_p}\zeta_{P_p}$ for the sake of brevity, and we have

$$Z_c = \frac{1}{N!\,h^{3N}}\sum_{P_p}\zeta_{P_p}\int dP\,dQ\,\big[P_p\,e^{-\frac{i}{\hbar}P\cdot Q}\big]\,e^{-\beta\hat{H}}\,e^{\frac{i}{\hbar}P\cdot Q}. \tag{4-45}$$
Taw 


This is formally a summation over phase space integrals where, in each term of the summation, there occurs the factor $e^{-\beta\hat{H}}e^{\frac{i}{\hbar}P\cdot Q}$ in the integrand that one now has to work out. As we see below, one can express this as a power series in $\hbar$, the leading term of which is the classical expression $e^{-\beta H}e^{\frac{i}{\hbar}P\cdot Q}$. The two differ in the fact that, while $e^{-\beta H}$ is the classical phase space function $\exp\big[-\beta\big\{\sum_i\frac{p_i^2}{2m} + \Phi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\big\}\big]$, $e^{-\beta\hat{H}}$ is built from the operator $\hat{H}$, which is a differential operator in the co-ordinate representation such that its action on a function $f(Q)$ is given by

$$\hat{H}f(Q) = \Big[-\sum_{i=1}^{N}\frac{\hbar^2}{2m}\nabla_i^2 + \Phi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\Big]f(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \tag{4-46}$$

The quantum mechanical expression (4-45) involves the summation over the permutations $P_p$. Each term in this sum gives a power series in $\hbar$ as indicated above. In the following, we explain the structure of the resulting expression for $Z_c$ (eq. (4-45)) and its relation to the classical partition function.

For this, we look at the expression

$$F(Q,P;\beta) = e^{-\beta\hat{H}}\,e^{\frac{i}{\hbar}P\cdot Q}, \tag{4-47a}$$

which we write as

$$F(Q,P;\beta) = e^{-\beta H(Q,P)}\,e^{\frac{i}{\hbar}P\cdot Q}\,w(Q,P;\beta). \tag{4-47b}$$

These two formulae define the functions $F$, $w$ that we now set out to determine in terms of a power series in $\hbar$.


The problem here lies in the determination of the action of the operator $e^{-\beta\hat{H}}$ on $e^{\frac{i}{\hbar}P\cdot Q}$. In comparison, it is simpler to evaluate the action of $-\hat{H}$ or some finite power of it on a given function, as in (4-46). This is made possible by noting that the function $F$ satisfies the differential equation (referred to as the Bloch equation)

$$\frac{\partial F}{\partial\beta} = -\hat{H}F, \tag{4-48a}$$

subject to the boundary condition

$$F(\beta = 0) = e^{\frac{i}{\hbar}P\cdot Q}. \tag{4-48b}$$

Replacing $F$ with the expression in (4-47b), and working through a few steps of algebra we obtain the following differential equation (in $\beta$) for $w$

$$\frac{\partial w}{\partial\beta} = \frac{\hbar^2}{2m}\big[\nabla^2w - \beta w\nabla^2\Phi - 2\beta\nabla w\cdot\nabla\Phi + \beta^2w(\nabla\Phi)^2\big] + \frac{i\hbar}{m}\big(\nabla w\cdot P - \beta w\nabla\Phi\cdot P\big) \equiv M(\beta;Q,P;w) \text{ (say)}, \tag{4-49a}$$

subject to the boundary condition

$$w(\beta = 0) = 1, \tag{4-49b}$$

(check this out). Note that the right hand side of the differential equation for $w$ involves the differential operator $\nabla$ with $N$ vector components

$$\nabla = \Big(\frac{\partial}{\partial\mathbf{r}_1}, \frac{\partial}{\partial\mathbf{r}_2}, \ldots, \frac{\partial}{\partial\mathbf{r}_N}\Big).$$

At this stage we assume an expansion of the form

$$w = w(Q,P;\beta) = \sum_{l=0}^{\infty}\hbar^l w_l, \tag{4-50}$$

and, substituting in (4-49a), equate like powers of $\hbar$, while making use of the boundary condition (4-49b). This is conveniently done by writing (4-49a), (4-49b) in the form of an integral equation

$$w(Q,P;\beta) = 1 + \int_0^\beta M(\beta';Q,P;w)\,d\beta'. \tag{4-51}$$
One thereby obtains, in successive stages of approximation,

$$w_0 = 1,$$

$$w_1 = -\frac{i\beta^2}{2m}\nabla\Phi\cdot P,$$

$$w_2 = -\frac{\beta^2}{2m}\Big[\frac{1}{2}\nabla^2\Phi - \frac{\beta}{3}\Big\{(\nabla\Phi)^2 + \frac{1}{m}(P\cdot\nabla)^2\Phi\Big\} + \frac{\beta^2}{4m}(\nabla\Phi\cdot P)^2\Big],$$

$$\cdots \tag{4-52}$$

(check this out; refer to [66], chapter 3, sec. 16). One can now substitute in (4-50) and then in (4-47b) so as to arrive at a series expansion of $F(Q,P;\beta)$ in powers of $\hbar$. Finally, we make use of this expansion in (4-45), now written as

$$Z_c = \frac{1}{N!\,h^{3N}}\sum_{P_p}\zeta_{P_p}\int dP\,dQ\,\big[P_p\,e^{-\frac{i}{\hbar}P\cdot Q}\big]\,F(Q,P;\beta). \tag{4-53}$$

Equivalently, one can directly make use of (4-50) to write,

$$Z_c = \frac{1}{N!\,h^{3N}}\sum_{P_p}\zeta_{P_p}\int dP\,dQ\,\big[P_p\,e^{-\frac{i}{\hbar}P\cdot Q}\big]\,e^{-\beta H(Q,P)}\,e^{\frac{i}{\hbar}P\cdot Q}\Big(\sum_{l=0}^{\infty}\hbar^l w_l\Big), \tag{4-54}$$

where the expressions for $w_l$, as obtained from (4-52), are to be inserted.
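The first stages of the order-by-order solution can be sketched as follows, assuming the equation for $w$ in the form $\partial w/\partial\beta = \frac{\hbar^2}{2m}[\nabla^2w - \beta w\nabla^2\Phi - 2\beta\nabla w\cdot\nabla\Phi + \beta^2w(\nabla\Phi)^2] + \frac{i\hbar}{m}(\nabla w\cdot P - \beta w\nabla\Phi\cdot P)$ with $w(\beta = 0) = 1$, so that $w_0(0) = 1$ and $w_l(0) = 0$ for $l \geq 1$:

```latex
\hbar^{0}:\quad \frac{\partial w_0}{\partial\beta} = 0
   \;\Longrightarrow\; w_0 = 1,
\\[6pt]
\hbar^{1}:\quad \frac{\partial w_1}{\partial\beta}
   = \frac{i}{m}\big(\nabla w_0\cdot P - \beta\,w_0\,\nabla\Phi\cdot P\big)
   = -\frac{i\beta}{m}\,\nabla\Phi\cdot P
   \;\Longrightarrow\; w_1 = -\frac{i\beta^{2}}{2m}\,\nabla\Phi\cdot P,
\\[6pt]
\hbar^{2}:\quad \frac{\partial w_2}{\partial\beta}
   = -\frac{\beta}{2m}\nabla^{2}\Phi + \frac{\beta^{2}}{2m}(\nabla\Phi)^{2}
     + \frac{\beta^{2}}{2m^{2}}(P\cdot\nabla)^{2}\Phi
     - \frac{\beta^{3}}{2m^{2}}(\nabla\Phi\cdot P)^{2}
\\[4pt]
\qquad\Longrightarrow\quad
   w_2 = -\frac{\beta^{2}}{4m}\nabla^{2}\Phi + \frac{\beta^{3}}{6m}(\nabla\Phi)^{2}
     + \frac{\beta^{3}}{6m^{2}}(P\cdot\nabla)^{2}\Phi
     - \frac{\beta^{4}}{8m^{2}}(\nabla\Phi\cdot P)^{2}.
```

In the second order equation the last two terms come from $\frac{i}{m}(\nabla w_1\cdot P - \beta w_1\nabla\Phi\cdot P)$ with $w_1$ as obtained in the preceding stage.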


In other words, contributions to $Z_c$ can be grouped into two distinct sets, each appearing as a power series in $\hbar$. Of these two, one set arises from the identity permutation ($P_p$ = identity) which yields a power series regardless of the quantum mechanical symmetry requirements on the particles making up the system under consideration, while the other set appears specifically due to the symmetry requirements that distinguish between bosons and fermions. Thus, one can write

$$Z_c = Z^{(1)} + Z^{(2)}, \tag{4-55a}$$

where

$$Z^{(1)} = \frac{1}{N!\,h^{3N}}\int dP\,dQ\;e^{-\beta H(Q,P)}\big(1 + w_1\hbar + w_2\hbar^2 + \cdots\big) = \sum_{l=0}^{\infty}X_l\hbar^l \text{ (say)}, \tag{4-55b}$$

and

$$Z^{(2)} = \frac{1}{N!\,h^{3N}}\sum_{P_p\neq\text{identity}}\int dP\,dQ\;e^{-\beta H(Q,P)}\,\zeta_{P_p}\Big[P_p\exp\Big(-\frac{i}{\hbar}P\cdot Q\Big)\Big]\exp\Big(\frac{i}{\hbar}P\cdot Q\Big)\big(1 + w_1\hbar + w_2\hbar^2 + \cdots\big) = \sum_{l=0}^{\infty}Y_l\hbar^l \text{ (say)}. \tag{4-55c}$$

In keeping with the definition of a permutation operator acting on the ordered set $P = \{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$, the factor $P_p\exp(-\frac{i}{\hbar}P\cdot Q)$ in the above expression stands for $\exp\big[-\frac{i}{\hbar}\sum_{i=1}^{N}\mathbf{p}_{\mu_i}\cdot\mathbf{r}_i\big]$, where $P_p$, acting on $P$, yields the set $P_pP = \{\mathbf{p}_{\mu_1}, \mathbf{p}_{\mu_2}, \ldots, \mathbf{p}_{\mu_N}\}$.

The various terms appearing in the above expansion for $Z_c$ can now be interpreted as follows.


The term $X_0 = \frac{1}{N!\,h^{3N}} \int dP\,dQ\, e^{-\beta H(Q,P)}$ represents,
as we have worked out in sec. 4.1, the classical limit of the partition
function for a system of indistinguishable particles interacting through the
potential energy function $\Phi$, where one need not distinguish between
bosons and fermions. The succeeding terms in $Z^{(1)}$ represent quantum
corrections to $X_0$ of various orders in $\hbar$, again without regard to the
specific symmetry requirement on the joint state vectors of the particles,
i.e., without regard to whether the particles are bosons or fermions, where
these quantum corrections depend on the interaction energy function $\Phi$
through the functions $w_l$ ($l = 1, 2, \cdots$) obtained from the Bloch
equation (refer to (4-48a), along with (4-52)). The leading term $X_0$, on the
other hand, survives even in the absence of interactions, when it
represents, precisely, the partition function of the classical ideal gas.


In contrast to $Z^{(1)}$, the contribution $Z^{(2)}$ to $Z_c$ (see (4-55c))
depends on whether the system under consideration is made up of bosons or
fermions (through the various possible permutations $P_p$ ($N!$ in number),
associated with their characteristic parity numbers $\epsilon_{P_p}$).
Analogous to $Z^{(1)}$, $Z^{(2)}$ also possesses a series expansion in
$\hbar$, the leading term of which, namely $Y_0$, gives the
symmetry-dependent quantum correction to the classical partition function
$X_0$, where the correction depends on the thermal de Broglie wavelength
$\lambda_T$, vanishing in the limit $\lambda_T \to 0$.


This can be seen by looking at terms in $Y_0$ corresponding to the permutation
interchanging any two specified indices, say, 1 and 2, which means that
$P_pP = \{p_2, p_1, p_3, \cdots, p_N\}$ (refer to [66], chapter 3, sec. 16).
One observes that the contribution of each of these terms vanishes
exponentially as $\lambda_T \to 0$.


Thus, in the case of a system of non-interacting particles, $Y_0$ describes
the deviation of an ideal FD or BE gas from the classical ideal gas, the
behavior of the latter being described by $X_0$. In the thermodynamic limit,
the deviation described by $Y_0$ itself appears in the form of an infinite
series and, in the leading approximation, this deviation is of the order of
$\lambda_T^3$ (i.e., of the order of $h^3$, for any given temperature $T$; see
sections 3.3.6.1, 3.3.7.1). As seen from (4-55c), this leading correction is
followed by correction terms of higher orders in $\hbar$, all of which depend
on the intermolecular interaction $\Phi$ and vanish as $\Phi \to 0$.


With the above approximation scheme for $Z_c$ in place, one can, in
principle, work out the grand partition function $Z_g$ and then obtain the
virial expansion for the equation of state as outlined in sec. 4.1, where the
successive virial coefficients $B_2, B_3, \cdots$ now appear in the form of
power series in $\hbar$ (involving other relevant system parameters as well),
constituting the quantum corrections to the classical virial expansion.


One can also obtain the virial expansion directly from $Z_c$ by going to the
limit $N \to \infty$.


Quantum mechanical corrections to the classical limit of the partition 
function for interacting systems can also be worked out systematically in 
an elegant scheme of approximation based on the Wigner representation 
of states and observables of quantum mechanical systems (the Wigner 
representation is briefly outlined in sec. 10.2.4). The basic idea
underlying this approximation scheme is to be found in [2], chapter 2.


4.2.2 Quantum mechanical corrections to the second virial coefficient


I close this section by writing out the first two terms of the correction to
the second virial coefficient $B_2$ while skipping the relevant derivations
(we assume, as before, that the potential energy function $\Phi$ can be
expressed as a sum of pair potentials $\phi_{ij}$):


$$B_2 = B_2^{(\mathrm{classical})}
+ \frac{h^2}{24\pi m (k_BT)^3} \int_0^{\infty} e^{-\beta\phi(r)} \Big(\frac{d\phi}{dr}\Big)^2 r^2\,dr
\pm \frac{1}{4\sqrt{2}}\,\frac{h^3}{(2\pi m k_BT)^{3/2}} + \cdots, \tag{4-56}$$


where the upper and lower signs in the third term apply to the FD and 


the BE cases respectively. 
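

To get a feel for the relative sizes of the three terms in (4-56), the following is a minimal numerical sketch for a Lennard-Jones pair potential in reduced units ($\epsilon = \sigma = 1$). Several ingredients are assumptions introduced only for illustration and are not taken from the text: the Lennard-Jones form itself, the standard classical expression $B_2^{(\mathrm{classical})} = 2\pi\int_0^\infty (1 - e^{-\beta\phi})r^2dr$, the dimensionless quantum parameter `de_boer` $= h/(\sigma\sqrt{m\epsilon})$, and the helper names `simpson`, `b2_classical`, `b2_h2_term`, `b2_exchange_term`. All results are in units of $\sigma^3$.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n is forced to be even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def lj(r):
    """Lennard-Jones potential in reduced units (assumed model, epsilon = sigma = 1)."""
    s6 = r ** -6
    return 4.0 * (s6 * s6 - s6)

def lj_prime(r):
    """Derivative d(phi)/dr of the reduced Lennard-Jones potential."""
    s6 = r ** -6
    return (-48.0 * s6 * s6 + 24.0 * s6) / r

# Below R_MIN the Boltzmann factor underflows to zero; beyond R_MAX the
# integrands are negligible for the purposes of this sketch.
R_MIN, R_MAX = 0.2, 20.0

def b2_classical(t_star):
    """Classical second virial coefficient, 2*pi * int (1 - exp(-phi/T*)) r^2 dr."""
    core = R_MIN ** 3 / 3.0  # integrand equals r^2 inside the hard core region
    rest = simpson(lambda r: (1.0 - math.exp(-lj(r) / t_star)) * r * r, R_MIN, R_MAX)
    return 2.0 * math.pi * (core + rest)

def b2_h2_term(t_star, de_boer):
    """Second term of (4-56), rewritten in reduced units."""
    integral = simpson(
        lambda r: math.exp(-lj(r) / t_star) * lj_prime(r) ** 2 * r * r, R_MIN, R_MAX
    )
    return de_boer ** 2 / (24.0 * math.pi * t_star ** 3) * integral

def b2_exchange_term(t_star, de_boer, statistics):
    """Third term of (4-56): upper sign (+) for FD, lower sign (-) for BE."""
    lam = de_boer / math.sqrt(2.0 * math.pi * t_star)  # lambda_T / sigma
    sign = {"FD": +1.0, "BE": -1.0}[statistics]
    return sign * lam ** 3 / (4.0 * math.sqrt(2.0))

if __name__ == "__main__":
    t_star, de_boer = 2.0, 0.5  # illustrative values only
    print("classical :", b2_classical(t_star))
    print("h^2 term  :", b2_h2_term(t_star, de_boer))
    print("exchange  :", b2_exchange_term(t_star, de_boer, "FD"))
```

In this parametrization the $h^2$ term scales as the square of `de_boer` and the exchange term as its cube, which mirrors the ordering of the corrections in (4-56).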


In this expression, $B_2^{(\mathrm{classical})}$ stands for the second virial
coefficient in the classical limit, giving the correction to the equation of
state of the classical ideal gas arising due to weak interactions among the
constituent particles, where this correction is inherent in the first term
($X_0$) in the expansion of $Z^{(1)}$ in (4-55b) (refer to (4-55a)). Quantum
corrections to $B_2$ arise on several counts. The second term (refer to [86],
section 77) on the right hand side of (4-56), proportional to $h^2$, arises
from $X_2$ in (4-55b), the contribution of $X_1$ being zero owing to the fact
that the momentum integral involving $w_1$ is identically zero ($w_1$ is an
odd function of the momenta while $e^{-\beta H(Q,P)}$ is an even function of
these). Recall that the $X_k$'s ($k = 1, 2, \cdots$) represent that part of
the quantum corrections to the partition function which does not depend
specifically on the bosonic or fermionic nature of the system under
consideration. The subsequent corrections to $B_2$ coming from $Z^{(1)}$ are
of degree four (and higher) in $h$, and have not been included in (4-56).
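

The parity argument can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a single degree of freedom, with the one-dimensional reduction of the standard Kirkwood coefficients, $w_1 = -\frac{i\beta^2}{2m}\Phi' p$ and $w_2 = -\frac{1}{2m}[\frac{\beta^2}{2}\Phi'' - \frac{\beta^3}{3}(\Phi'^2 + \frac{p^2}{m}\Phi'') + \frac{\beta^4}{4m}\Phi'^2p^2]$, and averages them over the Gaussian momentum weight $e^{-\beta p^2/2m}$. The function names and the numerical values of `dphi` ($\Phi'$) and `d2phi` ($\Phi''$) at the chosen point are hypothetical.

```python
import math

def gaussian_average(f, beta, m, n=4000, cutoff=12.0):
    """Average f(p) over the weight exp(-beta p^2 / 2m) with Simpson's rule
    on a symmetric momentum grid (n must be even)."""
    p_max = cutoff * math.sqrt(m / beta)
    h = 2.0 * p_max / n
    num = 0.0
    den = 0.0
    for k in range(n + 1):
        p = -p_max + k * h
        coeff = 1 if k in (0, n) else (4 if k % 2 else 2)
        weight = coeff * math.exp(-beta * p * p / (2.0 * m))
        num += weight * f(p)
        den += weight
    return num / den

def w1(p, beta, m, dphi):
    """First-order coefficient: odd in p (purely imaginary)."""
    return -1j * beta ** 2 / (2.0 * m) * dphi * p

def w2(p, beta, m, dphi, d2phi):
    """Second-order coefficient: even in p."""
    return -(1.0 / (2.0 * m)) * (
        0.5 * beta ** 2 * d2phi
        - (beta ** 3 / 3.0) * (dphi ** 2 + p * p * d2phi / m)
        + (beta ** 4 / (4.0 * m)) * dphi ** 2 * p * p
    )

beta, m, dphi, d2phi = 0.7, 1.3, 0.9, -0.4  # hypothetical values

avg_w1 = gaussian_average(lambda p: w1(p, beta, m, dphi), beta, m)
avg_w2 = gaussian_average(lambda p: w2(p, beta, m, dphi, d2phi), beta, m)

# Closed form obtained by inserting <p^2> = m / beta into w2.
expected_w2 = -beta ** 2 * d2phi / (12.0 * m) + beta ** 3 * dphi ** 2 / (24.0 * m)

print("<w1> =", avg_w1)
print("<w2> =", avg_w2, " closed form:", expected_w2)
```

The vanishing average of $w_1$ is the reason no term linear in $h$ appears in (4-56), while the non-zero average of $w_2$ feeds the $h^2$ term.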


The correction of degree three in $h$, i.e., the third term in (4-56), comes
from the leading contribution (i.e., $Y_0$) to $Z^{(2)}$, the latter being
the contribution to $Z_c$ that specifically distinguishes bosons from
fermions. In the case of an ideal gas, it is $Y_0$ that tells us how an FD gas
differs from a BE one. The leading contribution to $Y_0$, resulting in the
third term (of the order of $h^3$) in (4-56), has already been encountered in
chapter 3 (refer to formulas (3-73c) and (3-82)).


Corrections to $B_2$ involving terms of degree four (and higher) in $h$ are
not of appreciable practical relevance, and have not been included in (4-56),
though all these can be determined, in principle, from the Kirkwood
expansion outlined above.


In summary, quantum effects on the equation of state depend primarily
on the thermal de Broglie wavelength $\lambda_T$ in relation to (a) the mean
separation between the molecules and (b) the effective range and strength of
the intermolecular interaction. The Planck constant ($h$), in combination
with parameters like the molecular mass ($m$), the temperature ($T$), and
the range ($\sigma$) and strength ($\epsilon$) of the interaction, determines
how and to what extent the behavior of the gas departs from the classical
equation of state.


The quantum mechanical corrections for a dilute gas allow one to retain the
same form of the equation of state as the classical equation $pV = Nk_BT$,
where the right hand side gets modified by the virial expansion. The
corrections (in particular, the leading one involving the second virial
coefficient) can be interpreted as resulting from a quantum mechanical
‘interaction’ between the molecules (an effectively repelling one in the case
of fermions and an attracting one in the case of bosons) arising from the
indistinguishability effect.
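

The effective ‘interaction’ can be made concrete with a short sketch. It relies on a standard textbook result that is not derived in this section and is therefore an assumption here: for an ideal gas of spinless particles, exchange modifies the two-particle Boltzmann factor to $1 \pm e^{-2\pi r^2/\lambda_T^2}$ (plus for BE, minus for FD), which corresponds to a "statistical potential" $v(r) = -k_BT\ln(1 \pm e^{-2\pi r^2/\lambda_T^2})$. Inserting this factor into the classical formula $2\pi\int_0^\infty(1 - e^{-\beta v})r^2dr$ should reproduce the third term of (4-56), $\pm\lambda_T^3/(4\sqrt{2})$ with the upper sign for FD. The function names are hypothetical.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n is forced to be even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def exchange_pair_factor(r, lam, statistics):
    """Effective two-particle Boltzmann factor due to exchange (assumed form)."""
    sign = {"BE": +1.0, "FD": -1.0}[statistics]
    return 1.0 + sign * math.exp(-2.0 * math.pi * r * r / (lam * lam))

def statistical_potential(r, lam, statistics, kT=1.0):
    """v(r) = -kT ln(factor); for FD this diverges as r -> 0, so use r > 0."""
    return -kT * math.log(exchange_pair_factor(r, lam, statistics))

def b2_exchange_numeric(lam, statistics):
    """2*pi * int_0^inf (1 - factor) r^2 dr, truncated at 10 lambda_T."""
    integrand = lambda r: (1.0 - exchange_pair_factor(r, lam, statistics)) * r * r
    return 2.0 * math.pi * simpson(integrand, 0.0, 10.0 * lam)

lam = 1.0  # thermal wavelength, arbitrary length unit
closed_form = lam ** 3 / (4.0 * math.sqrt(2.0))
for stats in ("FD", "BE"):
    print(stats, b2_exchange_numeric(lam, stats), statistical_potential(0.5 * lam, lam, stats))
```

The sign of `statistical_potential` at any finite separation is positive (repulsive) for FD and negative (attractive) for BE, in line with the interpretation given above.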


4.3 The quantum virial expansion 


It is possible to set up formally a quantum mechanical virial expansion
for the equation of state where the expansion parameter is the density
($\rho$), and the Planck constant $h$ does not appear as an additional
expansion parameter, as it did in the expansion outlined in sec. 4.2, where
the virial expansion appeared in the form of quantum corrections to the
classical expansion.


This quantum virial expansion can be set up by referring to the cumulant
expansion in the classical case, where the functions
$W_k(r_1, r_2, \cdots, r_k)$ ($k = 1, 2, \cdots$) are defined by referring to
the formula


$$Z_c = \frac{1}{N!\,\lambda_T^{3N}} \int d^{3N}r\; W_N(r_1, r_2, \cdots, r_N), \tag{4-57a}$$


according to which


$$W_k(r_1, r_2, \cdots, r_k) = e^{-\beta\Phi(r_1, r_2, \cdots, r_k)} \quad (k = 1, 2, \cdots). \tag{4-57b}$$

The transformation to the cumulant functions $U_l(r_1, r_2, \cdots, r_l)$
($l = 1, 2, \cdots$) is then defined as in (4-15a), (4-15b).


Recall, from sec. 4.1.2.2, that in order to evaluate the canonical partition
function for a given particle number $N$, one needs the cumulants $U_l$ for
$l = 1, 2, \cdots, N$ which, in turn, are defined in terms of the functions
$W_k$ for $k = 1, 2, \cdots, N$. The grand partition function, on the other
hand, depends on the $W_k$'s defined for all positive integer values of $k$.


However, it is important to note that the transformation itself does not
depend on the specific form (4-57b) of the functions $W_k$, which is precisely
why an analogous transformation can be introduced in the quantum case
as well by expressing the quantum mechanical canonical partition function
in the form


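$$Z_c = \frac{1}{N!\,\lambda_T^{3N}} \sum_n \int d^{3N}r\; \psi_n^*(r_1, r_2, \cdots, r_N)\, e^{-\beta E_n}\, \psi_n(r_1, r_2, \cdots, r_N)
= \frac{1}{N!\,\lambda_T^{3N}} \int d^{3N}r\; W_N(r_1, r_2, \cdots, r_N), \tag{4-58a}$$


i.e., by defining

$$W_N(r_1, r_2, \cdots, r_N) = \sum_n \psi_n^*(r_1, r_2, \cdots, r_N)\, e^{-\beta E_n}\, \psi_n(r_1, r_2, \cdots, r_N), \tag{4-58b}$$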


where the energy eigenfunctions $\psi_n(r_1, r_2, \cdots, r_N)$
($n = 1, 2, \cdots$) are not normalized in the usual manner as
$\int d^{3N}r\,|\psi_n|^2 = 1$, but by the requirement (refer to sec. 4.1.2.2,
paragraph following eq. (4-14)) that $W_N \to 1$ as the separations between
all pairs of position vectors in the set $\{r_1, r_2, \cdots, r_N\}$ are made
arbitrarily large (refer to the factor $\frac{1}{N!\,\lambda_T^{3N}}$ outside
the integral in (4-58a)).


One can now define the cumulant functions $U_l(r_1, r_2, \cdots, r_l)$
($l = 1, 2, \cdots$) once again as in (4-15a), (4-15b), but this time in the
quantum context, since it may be checked in a straightforward manner that the
functions $W_k$ ($k = 1, 2, \cdots$) defined by means of (4-58b) satisfy all
the requirements stated in sec. 4.1.2.2 (reason this out by making use of the
fact that, for a composite system made up of two or more non-interacting
subsystems, the energy eigenvalues are given by sums of eigenvalues of the
subsystems and the corresponding eigenfunctions appear as products of the
eigenfunctions pertaining to the subsystems).


Kahn and Uhlenbeck ([72]) made use of this fact to conclude that a quantum
virial expansion can be set up by first defining a set of coefficients
$b_l$ ($l = 1, 2, \cdots$) as in (4-16) (the integral on the left hand side of
this equation goes like $V$ in the limit $V \to \infty$ in the quantum context
as well because of the properties of the functions $W_k$ ($k = 1, 2, \cdots$)
mentioned above), in terms of which the formula (4-17) holds for the quantum
mechanical grand partition function as for the classical one. On going over
to the thermodynamic limit, when the $b_l$'s appear as functions of
temperature alone, one ends up with the virial expansion (4-2), now in the
quantum context, where the virial coefficients are obtained from the quantum
mechanical coefficients $b_l$ ($l = 1, 2, \cdots$) as in (4-20) (refer to
(4-21)).


There is, however, a substantial price to pay. In the classical case, if
the potential energy function $\Phi(r_1, r_2, \cdots, r_k)$ decomposes as the
sum of pair potentials $\phi(r_{ij})$ ($i, j = 1, 2, \cdots, k$), then the
formula (4-57b) for the functions $W_k$ ($k = 1, 2, \cdots$) leads to the
cluster expansion where each of the coefficients $b_l$ ($l = 1, 2, \cdots$)
can be evaluated from the $l$-particle cluster diagrams by a finite
algorithm, since there occur only a finite number of cluster diagrams for any
given $l$. In the quantum case, on the other hand, the definition (4-58b)
stands in the way of setting up such a finite algorithm since it involves an
infinite set of eigenfunctions $\psi_n$, each of which satisfies an
$N$-particle Schrödinger equation for which there exists, in general, no
finite algorithm for its solution. It is in this sense that the quantum
virial expansion is only a formal one.


Beth and Uhlenbeck ([8], [9]) addressed the problem of setting up the
quantum virial expansion for the second virial coefficient along the above
lines, where one needs to consider only the two-body interaction among
the molecules. In the high temperature regime, where $\lambda_T$ is small
compared to the range of the molecular interaction, quantum mechanical
effects can be treated as small perturbations, and one recovers the
expansion obtained in sec. 4.2.2.


Beth and Uhlenbeck derived, for the sake of simplicity, only the series
expansion for $Z^{(1)}$ in the partition function (refer to (4-55a),
(4-55b)), which describes the quantum corrections obtained without regard
to the symmetry requirements on the state vectors of the system,
i.e., without distinguishing between bosons and fermions, though the
basic formula they started from included the more general case as
well. As seen in sec. 4.2.2, the symmetry requirements give rise to
the series $Z^{(2)}$ in the partition function, of which the contribution
of the leading term to the second virial coefficient has been seen to
be of a higher order of smallness compared to the leading correction
arising from $Z^{(1)}$ (see (4-55b)).


In the low temperature regime, on the other hand, additional parameters
are to be taken into account in comparison with $k_BT$, namely, the
discrete energy levels, if any, of the two-body bound state problem defined
by the potential $\phi$, and the phase shifts of the corresponding scattering
problem (in particular, the virtual levels, or resonances, assume relevance,
these being the energies across which the phase shifts change sharply),
since the second virial coefficient depends sensitively on these parameters
in this regime. At very low temperatures only the lowest bound state
and the lowest partial wave of the scattering state are relevant, and [9]
derived approximations to the second virial coefficient in several limiting
situations, starting from an exact expression involving all the bound
state energies and phase shifts (in this context, see also [67], chapter 10,
and [86], section 77). As expected, the quantum mechanical expression
for the second virial coefficient bears no resemblance to the classical one
in this low temperature regime.


Indeed, at sufficiently low temperatures, the contribution of the
attractive tail of the intermolecular potential dominates in the
expression for the classical partition function, and the second virial
coefficient diverges to $-\infty$ (reason this out). The quantum virial
coefficient (i.e., the actual value of the virial coefficient) differs
markedly from this and, in commonly occurring situations, is seen to have
a large positive value, in agreement with experimental findings.


In summary, one obtains the second virial coefficient in the quantum
virial expansion for a dilute gas by working out the bound state energies
and phase shifts for the two-body problem defined by the potential
$\phi(r)$. While the two-body problem is solvable in principle, the
corresponding solution for a $k$-body problem, with $k$ particles
interacting through a two-body potential, is not known for $k > 2$.


Kahn and Uhlenbeck ([72]) explored the condition under which the equa- 
tion of state given by the virial expansion can imply a phase transition 
at some temperature T when the gas condenses into a liquid, for which 
the pressure becomes essentially independent of volume at the transition 
temperature. In this context, they established that the quantum mechan- 
ical virial expansion was simply a generalization of the classical one, as 
seen in the present section, with the form of the expansion remaining 
unchanged. The condition under which a phase transition results can be 
stated in identical terms for the two cases, and bears analogy with the 


way the Bose-Einstein condensation occurs for a Bose gas. 


4.4 The virial expansion: a brief overview 


On the whole, the classical virial expansion has vindicated itself to a
substantial extent, when judged against experimental and computer-simulated
observations. However, it is designed specifically as a perturbative
expansion for low densities, and the question arises as to what its range of
validity can possibly be. Here, the picture is not so clear. Calculations of
the higher virial coefficients, in so far as they are meaningful, indicate
that the virial expansion starts to fail well before the onset of
condensation, and it has little value in describing the behavior of dense
gases and liquids.


A good way to enhance the convergence of the virial expansion is to look at
its Padé approximants where a Padé approximant, in the present context,
is a ratio of two truncated power series in $\rho$ whose coefficients can be
evaluated by comparing with the original virial expansion up to a certain
number of terms, but which provides a much better approximation (to
the ratio $\frac{p}{\rho k_BT}$) than the original expansion up to that many
terms. Since
virial coefficients up to comparatively high orders have been computed for
the hard sphere model, Padé approximants have been calculated for hard
sphere gases (and ‘hard disk fluids’ in two dimensions) where it transpires
that even an approximant involving a ratio of two quadratics in $\rho$ gives
a noticeably better fit with data obtained from computer simulation of
a hard sphere (or a hard disk) gas than the virial expansion up to $B_6$.
However, even the Padé approximation is seen to fail quite dramatically
at relatively high densities where, to all intents and purposes, a phase
transition (referred to as the Kirkwood transition) occurs in the simulated
system made up of hard spheres or hard disks. The latter resembles a
transition from a liquid phase to a crystalline phase.
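

The construction of a Padé approximant from a truncated virial series can be sketched as follows for hard spheres. Everything specific in this block is an assumption made for illustration: the approximant is formed directly for the compressibility factor $Z = p/(\rho k_BT)$ as a ratio of two quadratics in the packing fraction $\eta = \pi\rho\sigma^3/6$ (other choices of the function to be approximated are possible), the reduced virial coefficients $B_3/b^2 = 5/8$, $B_4/b^3 \approx 0.28695$, $B_5/b^4 \approx 0.110252$ (with $b = B_2 = 2\pi\sigma^3/3$) are commonly quoted literature values, and the Carnahan-Starling formula is used only as a convenient stand-in for simulation data.

```python
# Series coefficients of Z in powers of eta: c_k = (B_{k+1} / b^k) * 4^k.
REDUCED_VIRIALS = [1.0, 1.0, 0.625, 0.2869495, 0.110252]  # B_1..B_5 in units of b^(k-1)
C = [coef * 4.0 ** k for k, coef in enumerate(REDUCED_VIRIALS)]

def series(eta, coeffs=C):
    """Truncated virial series for Z (here up to B_5)."""
    return sum(c * eta ** k for k, c in enumerate(coeffs))

def pade_22(coeffs):
    """Coefficients of the [2/2] Pade approximant matching c_0..c_4.

    Returns (a0, a1, a2) and (b1, b2) for
        Z ~ (a0 + a1 x + a2 x^2) / (1 + b1 x + b2 x^2).
    """
    c0, c1, c2, c3, c4 = coeffs[:5]
    # Matching the x^3 and x^4 terms gives a 2x2 linear system for b1, b2:
    #   c2 b1 + c1 b2 = -c3,   c3 b1 + c2 b2 = -c4.
    det = c2 * c2 - c1 * c3
    b1 = (c1 * c4 - c2 * c3) / det
    b2 = (c3 * c3 - c2 * c4) / det
    a0 = c0
    a1 = c1 + b1 * c0
    a2 = c2 + b1 * c1 + b2 * c0
    return (a0, a1, a2), (b1, b2)

def pade(eta, coeffs=C):
    (a0, a1, a2), (b1, b2) = pade_22(coeffs)
    return (a0 + a1 * eta + a2 * eta ** 2) / (1.0 + b1 * eta + b2 * eta ** 2)

def carnahan_starling(eta):
    """Empirical hard-sphere equation of state, used as a reference curve."""
    return (1.0 + eta + eta ** 2 - eta ** 3) / (1.0 - eta) ** 3

for eta in (0.1, 0.2, 0.3, 0.4):
    print(eta, series(eta), pade(eta), carnahan_starling(eta))
```

At low packing fraction the three columns agree closely; at higher packing fraction the Padé form stays nearer to the reference curve than the truncated series built from the same coefficients.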


From a theoretical point of view, the virial expansion, with the individual
terms calculated in the thermodynamic limit, makes sense only when
it converges while, strictly speaking, one needs to obtain the
thermodynamic limit of the partition function itself (or, equivalently, of
the sum of the entire virial series), calculated for any given finite value
of V.


Generally speaking, the ‘sum of limits’ is not the same as the ‘limit of 


the sum’. 


In other words, one has to first sum up the series representing the par- 
tition function at any given temperature for a finite value of V, and 
then check for the existence of a temperature at which the thermody- 
namic functions resulting from this partition function possess a singu- 
larity (such as a discontinuity of (24),., as in fig. 4-6, or of (34) ,,) in the 
limit of large V. A number of useful results in this context will be
outlined in chapter 5.


4.5 Statistical mechanics of the liquid state 


4.5.1 The liquid state: introduction 


There exists a vast literature, covering both experimental and theoretical
approaches, devoted to the liquid state. For an informative and illuminating
general introduction to the liquid state, covering a wide area, see [64], [23],
[73], [95], while a good overview of the basic statistical mechanical
considerations applying to the liquid state can be obtained from [117], [66],
[101]. In this section, I draw from these sources.


In this book, we will be concerned with the classical description of the
liquid state. For a liquid made up of particles in the atomic state, the
condition for the validity of the classical description, which is in the
nature of a useful approximation, is the same as that for the classical
description of the gas phase, namely,

$$\lambda_T \ll \rho^{-1/3} = \Big(\frac{V}{N}\Big)^{1/3}, \tag{4-59a}$$


in a notation by now familiar. For a molecular liquid, one needs
additionally the condition


$$\frac{\hbar^2}{2I} \ll k_BT, \tag{4-59b}$$


to be satisfied, where $I$ stands for the moment of inertia for a rotational
motion of the molecule, which implies that the discreteness of the
rotational energy levels of the liquid molecules has no role to play in the
thermodynamic behavior of the liquid. The two conditions stated above
are found to be satisfied over wide ranges of temperature and pressure
for most liquids excepting very light ones such as hydrogen, helium, and
neon.
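

The two conditions can be checked with rough numbers. The sketch below uses SI values of the physical constants; the liquid-state number densities, temperatures, and the moment of inertia of the nitrogen molecule are approximate, illustrative inputs supplied here as assumptions (they are not taken from the text), and the function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
HBAR = H / (2.0 * math.pi)
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg

def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength lambda_T = h / sqrt(2 pi m k_B T)."""
    return H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)

def translational_ratio(mass, temperature, number_density):
    """lambda_T / rho^(-1/3); the classical description needs this to be << 1."""
    return thermal_wavelength(mass, temperature) * number_density ** (1.0 / 3.0)

def rotational_ratio(moment_of_inertia, temperature):
    """(hbar^2 / 2I) / (k_B T); the classical description needs this to be << 1."""
    return HBAR ** 2 / (2.0 * moment_of_inertia * K_B * temperature)

# Approximate, illustrative inputs (assumptions):
argon = translational_ratio(39.948 * AMU, 85.0, 2.1e28)   # liquid argon near 85 K
helium = translational_ratio(4.0026 * AMU, 4.0, 1.9e28)   # liquid helium-4 near 4 K
nitrogen_rot = rotational_ratio(1.4e-46, 77.0)            # N2 near 77 K

print("argon    lambda_T / spacing :", argon)
print("helium-4 lambda_T / spacing :", helium)
print("N2 rotational ratio         :", nitrogen_rot)
```

With these inputs the argon and nitrogen ratios come out well below unity, while the helium ratio is of order one, consistent with the exceptions listed above.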


The classical approximation entails the great simplification that the effect
of the translational thermal motion can be clearly separated from that of
the mutual interaction of the molecules in describing the thermodynamic
behavior of the liquid, analogous to what one finds in the gas phase.
The special feature of the liquid phase is that the mean kinetic energy
$K_N$ (for a system with $N$ particles; see below for a slight alteration in
the notation) is of the same order as the mean potential energy $|V_N|$, in
contrast to the case of a gas ($|V_N|/K_N \ll 1$) as also to that of a solid
(at relatively low temperatures; $|V_N|/K_N \gg 1$). An alternative
characterization of the liquid state can be stated in the form


$$\frac{N\sigma^3}{V} \sim 1, \qquad \frac{k_BT}{\epsilon} \sim 1, \tag{4-60}$$

where $\sigma$ stands for the range of the intermolecular interactions
(typically, the repulsive part), and $\epsilon$ for the strength of the
attractive interaction.


Compared to the gaseous phase, the liquid phase is characterized by a
greater role of collisional processes and, consequently, of short-range
positional correlations (the solid phase, on the other hand, is characterized
by long-range order). The short-range correlations in a liquid give it a
characteristic structure that is explained, to a large extent, by the
repulsive core of the intermolecular interaction. As a consequence, a great
many aspects of the liquid phase are explained by the hard sphere model
of molecular interactions.


In the present introductory exposition, we will not consider molecular liq- 
uids where the intermolecular interactions have generally a pronounced di- 
rectional property. Unless otherwise stated, the intermolecular 
potential energy will be assumed to be a sum of spherically 


symmetric pair potentials. 


Relatively simple as it is, the hard sphere model does not explain the
liquid-gas phase transition (it does, however, indicate a transition to a
crystalline phase at $\rho^* \approx 0.95$ in molecular dynamics simulations)
and gives an incomplete picture of the liquid state in so far as it does not
take into account the attractive tail of the intermolecular interaction. The
repulsive core and the attractive tail seem to be two distinct but
complementary features of the intermolecular interaction with distinct roles
in explaining the characteristic features of the liquid state. As seen in
the van der Waals theory (see sec. 4.1.3.3, and fig. 4-6), both of these
taken together, in the form of the square well potential, explain the looped
isotherms below the critical temperature, where one has to recall that the
van der Waals theory itself is deficient in that it takes into account the
intermolecular interactions only in a mean field approximation.


Potentials more realistic than the square well model can be generated 
from quantum mechanical calculations (these can also be determined, in 
principle, from atom-atom scattering data) but such calculations gener- 
ally pose formidable problems, especially in obtaining the repulsive core, 
and one commonly makes use of phenomenological models such as the 
Lennard-Jones potential that capture qualitatively a number of features 


indicated in the quantum mechanical theory. 


With this background relating to the liquid state, we turn to the explana- 
tion of the thermodynamic behavior of liquids by employing the principles 


of statistical mechanics. 


As has been mentioned in sec. 4.4, available evidence indicates that the
virial expansion does not lend itself to making meaningful statements
about the phase transition of a gas, and makes no sense for systems
in the liquid phase. This is due to the fact that the individual terms
of the virial series are obtained only in the thermodynamic limit, while
one needs to go over to the thermodynamic limit of the partition function
itself (evaluated for given finite values of V and N) so as to see when and
how a phase transition occurs, i.e., to take the thermodynamic limit after
the virial series (or any other series expansion representing the partition
function) is summed up.


Of course, any exact expression for the grand partition function has to lead 
to the virial expansion for the equation of state at sufficiently low densities, 
though the coefficients of the series may not have a graphical interpretation 


in terms of cluster diagrams. 


For a system in the liquid state, the full partition function is out of our 
reach. Interestingly, the full partition function is not really needed to 
describe the thermodynamic behavior of the system since the latter is well 
described in terms of a few distribution functions defined below. True, the 
distribution functions are objects defined with reference to the partition 
function, but the full information content of the partition function is by 
no means necessary to obtain the distribution functions of interest. The 
distribution functions may be made use of for the gas phase as well, 
but there the first few virial coefficients are more conveniently invoked in 
describing the thermodynamic behavior of the system, once again making 


redundant the information content of the full partition function. 


The distribution functions are specifically useful in describing the
behavior of the liquid phase because of the fact that the liquid has, on the
average, a local structure that is encoded in the distribution functions,
only a few of the distribution functions being enough to encode the major
part of the structure. In particular, most of the thermodynamic behavior
of a large class of liquids is explained in terms of the two-particle
distribution function, or the pair correlation function that one defines in
terms of the former, this being so because the thermodynamic functions can be
related to averages of sums over two-body terms when the potential energy
of the system breaks up into a similar sum. In contrast, a gas has,
on the average, no local structure, and the virial coefficients are more
appropriate for describing its thermodynamic behavior.


The two-particle distribution function does not, of course, give a complete 
description of the behavior of a liquid, just as the first few virial coeffi- 
cients prove inadequate in the case of a gas. A more complete description 
needs the specification of a whole set of distribution functions, the deter- 
mination of which poses an enormously complex problem. The distribu- 
tion functions satisfy a hierarchy of equations, which one has to truncate 
in some meaningful way to arrive at results of practical relevance. This 


problem will be addressed briefly in sec. 4.5.5 below. 


From the experimental point of view, one can learn about the physical
properties of the liquid state from macroscopic studies relating to the
equation of state, i.e., the p-V-T behavior, and also from microscopic
studies relating to the scattering of radiation from liquids. Both types of
data can be related to the two-particle distribution function.


4.5.2 The distribution functions


We start by recalling the expression of the classical canonical partition
function for a system of $N$ particles confined in a volume $V$ at a
temperature $T$ in the absence of external fields. As mentioned above, we
assume the potential energy function of the system to decompose into
a sum of two-body spherically symmetric potentials. We adopt the
following notational abbreviation (compare notation introduced in sec. 2.2,
where the idea underlying the abbreviation remains the same): $r^{[N]}$ will
denote a point $(r_1, r_2, \cdots, r_N)$ in the configuration space of the
system, while $(r^{[N]}, p^{[N]})$ will denote a point
$(r_1, r_2, \cdots, r_N; p_1, p_2, \cdots, p_N)$ in the phase space.
Likewise, $d^{[N]}r\,d^{[N]}p$ will denote an infinitesimal volume element
around $(r^{[N]}, p^{[N]})$, where the former is typically used in an
integration over some region in the phase space. The corresponding volume
element in the configuration space will be denoted by $d^{[N]}r$. More
generally, the super-index $[N]$ will be used to indicate that it is some
quantity relating to an $N$-particle collection that is being referred to.
At this stage of our exposition, the meaning of a notation, even when not
stated explicitly, will have to be read from the context.


The canonical partition function is then expressed in the form


$$Z_c^{[N]} = \frac{1}{N!\,h^{3N}} \int d^{[N]}r\,d^{[N]}p\, \exp\big[-\beta H^{[N]}(r^{[N]}, p^{[N]})\big], \tag{4-61a}$$

where $H^{[N]}$ stands for the Hamiltonian function, given by


$$H^{[N]}(r^{[N]}, p^{[N]}) = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \Phi^{[N]}(r^{[N]})
= \sum_{i=1}^{N} \frac{p_i^2}{2m} + \sum_{1 \le i < j \le N} \phi(r_{ij}). \tag{4-61b}$$


The suffixes ‘c’ and ‘g’ in $Z_c$, $Z_g$ will from now on be dropped whenever
the context makes it clear as to whether we are talking of the canonical or
the grand canonical partition function. In the above expression, it has been
assumed that the total potential energy function $\Phi^{[N]}$ decomposes into
a sum of spherically symmetric pair potentials, and $r_{ij} = |r_i - r_j|$.


The separation between the momentum and position co-ordinates in the
classical theory allows one to perform the momentum integration, whereby
the partition function gets expressed in terms of the configuration
integral $C^{[N]}$:


$$Z^{[N]} = \frac{1}{N!\,\lambda_T^{3N}}\, C^{[N]}
= \frac{1}{N!\,\lambda_T^{3N}} \int d^{[N]}r\, \exp\big[-\beta\Phi^{[N]}(r^{[N]})\big]. \tag{4-61c}$$


Having recalled all this, one can write down the probability for a specified
set of $n$ ($\le N$) particles (marked, say, 1 to $n$) to be found within a
volume element $d^{[n]}r$ in the configuration space (regardless of their
momenta) as $P^{[n]}d^{[n]}r$, where the relevant probability density
$P^{[n]}(r^{[n]})$ is given by


$$P^{[n]}(r^{[n]}) = \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1}\,d^{(3)}r_{n+2} \cdots d^{(3)}r_N\, \exp\big[-\beta\Phi^{[N]}(r^{[N]})\big]. \tag{4-62}$$

A quantity of greater physical relevance, defined below, is proportional
to the probability density for any (unspecified) set of $n$ particles to be
located at points $r_1, r_2, \cdots, r_n$ (collectively denoted by
$r^{[n]}$), and is termed the $n$-particle distribution function,


$$\rho^{[n]}(r^{[n]}) = \frac{N!}{(N-n)!}\, P^{[n]}(r^{[n]})
= \frac{N!}{(N-n)!}\, \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1} \cdots d^{(3)}r_N\, \exp\big[-\beta\Phi^{[N]}(r^{[N]})\big], \tag{4-63a}$$


whose normalization is given by

$$\int \rho^{[n]}(r^{[n]})\, d^{[n]}r = \frac{N!}{(N-n)!} \tag{4-63b}$$

(reason this out). The one-particle distribution function $\rho^{[1]}$ is
trivial in the sense that $\rho^{[1]}(r)\,d^{(3)}r$ describes the probability
that any (unspecified) particle resides in the three dimensional volume
element $d^{(3)}r$ around the point $r$. For a homogeneous fluid (i.e., one in
equilibrium in the absence of any external field), one obtains


$$\rho^{[1]}(r) = \rho, \tag{4-64}$$


where $\rho$ ($= \frac{N}{V}$) is the number density of the fluid (check this
out; make use of (4-63a) in (4-62) with $n = 1$, taking into account the
independence of $\rho^{[1]}(r)$ on $r$).
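

The normalization (4-63b) and the homogeneity result (4-64) can be verified by brute force on a small model. The sketch below is a discrete analogue, introduced purely for illustration: positions are restricted to the sites of a small ring so that configuration integrals become finite sums, and the pair energy (a soft on-site repulsion and a nearest-neighbour attraction) is a hypothetical choice. The number of sites plays the role of the volume $V$.

```python
import itertools
import math

def lattice_distributions(n_sites=6, n_particles=3, beta=1.0, coupling=-0.5):
    """One- and two-particle distribution functions for labelled particles on a ring.

    Returns (rho1, rho2) with rho1[s] and rho2[s][t] the discrete analogues of
    the n = 1 and n = 2 distribution functions of (4-63a).
    """
    def energy(config):
        total = 0.0
        for i in range(n_particles):
            for j in range(i + 1, n_particles):
                d = abs(config[i] - config[j])
                d = min(d, n_sites - d)      # distance along the ring
                if d == 0:
                    total += 2.0             # soft on-site repulsion (assumed)
                elif d == 1:
                    total += coupling        # nearest-neighbour attraction (assumed)
        return total

    configs = list(itertools.product(range(n_sites), repeat=n_particles))
    weights = [math.exp(-beta * energy(cfg)) for cfg in configs]
    c_n = sum(weights)                       # discrete configuration "integral"

    n = n_particles
    rho1 = [0.0] * n_sites
    rho2 = [[0.0] * n_sites for _ in range(n_sites)]
    for cfg, w in zip(configs, weights):
        # Fix the positions of particles 1 (and 2); sum over all the others.
        rho1[cfg[0]] += n * w / c_n
        rho2[cfg[0]][cfg[1]] += n * (n - 1) * w / c_n
    return rho1, rho2

rho1, rho2 = lattice_distributions()
print("rho1 per site:", rho1)
print("sum of rho1  :", sum(rho1))
print("sum of rho2  :", sum(sum(row) for row in rho2))
```

Because the model is translationally invariant, every entry of `rho1` equals the number of particles divided by the number of sites, the lattice counterpart of $\rho = N/V$.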


The two-particle distribution function is of great relevance in the
description and explanation of the thermodynamic behavior of liquids.

Making use of the $n$-particle distribution function $\rho^{[n]}$, we define
the $n$-particle correlation function $g^{[n]}$ as


$$\rho^{[n]}(r^{[n]}) = \rho^{[n]}(r_1, r_2, \cdots, r_n) = \rho^n g^{[n]}(r_1, r_2, \cdots, r_n)
= \rho^n g^{[n]}(r^{[n]}), \tag{4-65}$$


where the notation introduced above for the sake of brevity has been
invoked so as to make it familiar. The two-particle correlation function
$g^{[2]}$, also referred to as the pair correlation function, defined as

$$\rho^{[2]}(r_1, r_2) = \rho^2 g^{[2]}(r_1, r_2), \tag{4-66}$$


is of particular interest since, in the case when the \-particle potential 
energy (rll) reduces to a sum of two-particle potentials ()7; ; o(ri;): 
here the sum is taken over all pairs i,j (1 <i < j < N)), it is the pair 
correlation function that determines a number of thermodynamic prop- 
erties of a liquid (see sec. 4.5.3). In virtue of the homogeneity of the 
liquid in equilibrium, g"!(r,,r) depends on r,,r2 only through the differ- 
ence r; — ry. As mentioned above, we will further assume that the pair 
potential ¢ is spherically symmetric, corresponding to the particles mak- 
ing up the system being neutral atoms, in which case the pair correlation 
function g?!(r;,r2) for any two given position vectors r,,r2 will be a func- 
tion of |r2 — r;|. In the following, we will further abbreviate the notation 
by writing g?!(r) as g(r), which is also referred to as the radial distribution 


Junction. 
Incidentally, looking at the definition (4-63a), written for $n = 2$, and
integrating over $d^{(3)}r_1 d^{(3)}r_2$, we obtain

\[
\int \rho^{[2]}\, d^{(3)}r_1\, d^{(3)}r_2 = N(N-1), \tag{4-67a}
\]

and then making use of (4-66) and dividing both sides by $N$ ($= \rho V$), we arrive at

\[
\int_0^\infty 4\pi\rho r^2 g(r)\, dr = N, \tag{4-67b}
\]

where, close to the thermodynamic limit, we have replaced $N-1$ with $N$ on
the right hand side. This indicates the following physical interpretation of
the pair correlation function: considering any particular molecule within
the liquid, the average number of molecules lying within a distance $r$
to $r + dr$ is given by $4\pi\rho r^2 g(r)dr$. In other words, given the location of
any chosen particle, $\rho g(r)$ represents the average number density at a
distance $r$ from it.
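
The interpretation of $\rho g(r)$ as a local number density suggests a simple way of estimating $g(r)$ from a set of particle positions: count the pairs whose separation falls in a shell and divide by the count expected for uncorrelated particles. The following is a minimal sketch under stated assumptions (a periodic cubic box, the minimum-image convention, and uniformly random positions standing in for an ideal gas); the function name `radial_distribution` is ours and not part of the text.

```python
import math
import random

def radial_distribution(positions, box, n_bins, r_max):
    """Histogram estimate of g(r) for points in a periodic cubic box.

    Assumes r_max <= box / 2 so that the minimum-image convention is valid.
    """
    n = len(positions)
    rho = n / box ** 3
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            xj, yj, zj = positions[j]
            # minimum-image separation along each axis
            dx = xi - xj
            dy = yi - yj
            dz = zi - zj
            dx -= box * round(dx / box)
            dy -= box * round(dy / box)
            dz -= box * round(dz / box)
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            if r < r_max:
                counts[min(int(r / dr), n_bins - 1)] += 1
    g = []
    for b in range(n_bins):
        r_lo, r_hi = b * dr, (b + 1) * dr
        shell = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        # each pair is counted once, so N * rho * shell / 2 pairs are
        # expected in the shell when g = 1
        g.append(counts[b] / (0.5 * n * rho * shell))
    return g

random.seed(0)
N, L = 300, 1.0
pos = [(random.random() * L, random.random() * L, random.random() * L)
       for _ in range(N)]
g = radial_distribution(pos, L, n_bins=10, r_max=0.5)
outer_mean = sum(g[5:]) / len(g[5:])
print("mean of g(r) over the outer bins:", outer_mean)
```

For uncorrelated positions the estimate fluctuates around unity in every bin; the innermost bins contain few pairs and are correspondingly noisy.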


The two-particle distribution function $\rho^{[2]}$ can be expressed as the
expectation value of the product of two delta-functions centered around the
locations of any two particles, summed over the particle indices,

\[
\rho^{[2]}(\mathbf{r}, \mathbf{r}') = \sum_{i \ne j} \left\langle \delta(\mathbf{r} - \mathbf{r}_i)\, \delta(\mathbf{r}' - \mathbf{r}_j) \right\rangle, \tag{4-68}
\]

(check this out), which is consistent with (4-67a). This definition can
evidently be extended to $\rho^{[n]}$ for $n > 2$ ($n = 1$ corresponds to the density).


The typical variation of $g(r)$ with $r$ is shown schematically in fig. 4-7
(graph ‘A’), where it is seen that $g(r)$ falls off to zero at the distance of
closest approach (the collision diameter $\sigma$ for a hard sphere repulsive
core), oscillates with diminishing amplitude for larger values of $r$ (analogous
to but distinct from periodic oscillations in the case of solids,
graph marked ‘B’), and then attains the constant value $g(r) = 1$ as $r \to \infty$,
indicating the loss of correlation at large distances. The short range
oscillations are indicative of the local structure around any given molecule,
which is mostly accounted for by the repulsive core surrounding that
molecule. In contrast, the ideal gas (graph marked ‘C’) has no local structure
and corresponds to a constant value $g(r) = 1$ (in a real dilute gas,
the pair correlation abruptly falls off to zero at $r = \sigma$ due to the excluded
volume effect).


The local structure in a liquid, indicative of short-range order, is mostly
due to the repulsive core of the intermolecular potential. The liquid
molecules, in virtue of their thermal motion, tend to fill up space, but
cannot approach one another to within a distance less than the collision
diameter. Thus, given any chosen molecule, other molecules in its
close vicinity tend to achieve a close-packed configuration, while there is
formed a layer of next-nearest neighbors as well; after a few such layers,
however, the correlations die down, and the radial distribution function
$g(r)$ levels off to the value unity. The approximate close-packed configuration
for the first two layers around any chosen molecule is shown
schematically in fig. 4-8.


Distribution functions for dilute systems will be examined in section 5.8,
where the Kirkwood-Salsburg integral equations will be introduced.

Figure 4-7: The variation of the pair correlation $g(r)$ with $r$ (schematic) for
(A) a liquid, (B) a periodic solid, and (C) the ideal gas; in a liquid, $g(r)$ exhibits
short range oscillations for $r > \sigma$ which level off to $g(r) \to 1$ for $r \to \infty$; in a
periodic solid, $g(r)$ shows periodic oscillations, resulting from long range order;
for the ideal gas, $g(r) = 1$ for all $r$ due to a total absence of correlations; at small
distances $g(r)$ goes to zero if the pair potential between molecules is of the hard
sphere or square well type; for an exponential or a power law repulsive core, $g(r)$
attains a small non-zero value for small $r$.


4.5.3 The thermodynamic functions 


4.5.3.1 The internal energy 


Figure 4-8: Depicting schematically the close-packed configuration around any
chosen molecule in a liquid; because of the repulsive core of the intermolecular
interaction, there are formed a few layers of non-overlapping neighbors, the first
layer of nearest neighbors being the most well formed; the layer of next nearest
neighbors is less well formed, while the order is destroyed after a few such layers;
based on [73], fig. 2.2.

In the classical description, the mean energy of $N$ particles in equilibrium
in a volume $V$ at temperature $T$ decomposes into kinetic energy and
potential energy parts as

\[
U = \langle H^{[N]} \rangle = \langle K^{[N]} \rangle + \langle \Phi^{[N]} \rangle, \tag{4-69a}
\]

where the kinetic part $\langle K^{[N]} \rangle$ is given by

\[
\langle K^{[N]} \rangle = \frac{3}{2} N k_B T, \tag{4-69b}
\]

(check this out; $\langle K^{[N]} \rangle$ is independent of the potential and is the same
as the mean energy of a classical ideal gas; in the thermodynamic limit,
where $U$ represents the internal energy, one can invoke the equipartition
principle to arrive at this formula straightaway).
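
The kinetic part (4-69b) can be checked by direct sampling, since each momentum component is Gaussian distributed in the canonical ensemble irrespective of the potential. The sketch below assumes reduced units ($m = 1$, $k_BT = 1$) and a finite sample, so the agreement with $\frac{3}{2}k_BT$ per particle is statistical only; the function name is illustrative.

```python
import random

def mean_kinetic_energy_per_particle(n_samples, mass, kT, seed=1):
    """Average of p^2 / (2m) over Maxwell-Boltzmann distributed momenta."""
    rng = random.Random(seed)
    sigma = (mass * kT) ** 0.5  # standard deviation of each momentum component
    total = 0.0
    for _ in range(n_samples):
        px = rng.gauss(0.0, sigma)
        py = rng.gauss(0.0, sigma)
        pz = rng.gauss(0.0, sigma)
        total += (px * px + py * py + pz * pz) / (2.0 * mass)
    return total / n_samples

k_mean = mean_kinetic_energy_per_particle(20000, mass=1.0, kT=1.0)
print("sampled <K>/N in units of k_B T:", k_mean)
```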


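On recalling the assumptions stated in 4.5.2, one further obtains $\langle \Phi^{[N]} \rangle$
as a sum of $\frac{N(N-1)}{2}$ terms, each term representing the mean potential
energy of a pair of molecules (say, the $i$th and the $j$th ones; $1 \le i, j \le N$;
$i < j$). Since the molecules are all identical, all these mean two-particle
potential energies are equal, and hence $\langle \Phi^{[N]} \rangle$ reduces to $\frac{N(N-1)}{2}$ times the
mean potential energy between some particular pair, say the one with
$i = 1, j = 2$. One thereby obtains

\[
\langle \Phi^{[N]} \rangle = \frac{N(N-1)}{2}\, \frac{1}{C^{[N]}} \int d^{(3)}r_1\, d^{(3)}r_2\, \phi(|\mathbf{r}_2 - \mathbf{r}_1|)
\left[ \int d^{(3)}r_3 \cdots d^{(3)}r_N \exp\left(-\beta\Phi^{[N]}(\mathbf{r}^{[N]})\right) \right], \tag{4-70}
\]

(check this out). On making use of the definition (4-63a) (for the particular
case $n = 2$), one obtains

\[
\langle \Phi^{[N]} \rangle = \frac{1}{2} \int d^{(3)}r_1\, d^{(3)}r_2\, \phi(|\mathbf{r}_2 - \mathbf{r}_1|)\, \rho^{[2]}(\mathbf{r}_1, \mathbf{r}_2). \tag{4-71}
\]

We now invoke the spherical symmetry of $\phi$ and the definition of the pair
correlation function so as to obtain

\[
\langle \Phi^{[N]} \rangle = \frac{1}{2} \rho^2 V \int \phi(r)\, g(r)\, d^{(3)}r
= 2\pi N \rho \int_0^\infty r^2 \phi(r)\, g(r)\, dr. \tag{4-72}
\]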


This result is entirely consistent with the physical interpretation of the pair
correlation function indicated in the para following (4-67b) in sec. 4.5.2:
considering any particular molecule in the liquid, the mutual potential
energy for all molecules lying within a distance of $r$ to $r + dr$ is $4\pi\rho r^2 \phi(r) g(r) dr$;
multiplying with $N$, dividing by 2 (to correct for double counting), and
integrating, we obtain (4-72).


One is thus finally led to the following expression for the specific internal
energy ($u = \lim_{N \to \infty} \frac{U}{N}$) of a liquid for given values of temperature ($T$) and
particle density ($\rho = \lim_{N \to \infty} \frac{N}{V}$) in the thermodynamic limit:

\[
u = \frac{\langle H^{[N]} \rangle}{N} = \frac{3}{2} k_B T + 2\pi\rho \int_0^\infty r^2 \phi(r)\, g(r; \rho, T)\, dr, \tag{4-73}
\]

where the dependence of the pair correlation function on the density and
temperature is indicated.
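
To see how the energy equation (4-73) is used in practice, one may evaluate its potential part numerically for a model system. The sketch below is an illustration only: it assumes a Lennard-Jones pair potential in reduced units ($\epsilon = \sigma = k_B = 1$) and the dilute-gas form $g(r) \approx e^{-\beta\phi(r)}$, neither of which is prescribed by the text; for a dense liquid $g(r)$ has to be supplied from experiment, simulation, or theory.

```python
import math

def lj(r):
    """Lennard-Jones pair potential in reduced units (epsilon = sigma = 1)."""
    inv6 = r ** -6
    return 4.0 * (inv6 * inv6 - inv6)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

def excess_energy_per_particle(rho, T, r_min=0.3, r_max=30.0, n=20000):
    """Potential part of (4-73) with the assumed dilute form g = exp(-phi/T).

    Below r_min the Boltzmann factor is negligibly small, so the lower
    limit can be moved away from r = 0.
    """
    integrand = lambda r: r * r * lj(r) * math.exp(-lj(r) / T)
    return 2.0 * math.pi * rho * simpson(integrand, r_min, r_max, n)

T = 2.0
u_ex_low = excess_energy_per_particle(0.05, T)
u_ex_high = excess_energy_per_particle(0.10, T)
u_total_low = 1.5 * T + u_ex_low  # kinetic plus potential part, per particle
print(u_ex_low, u_ex_high, u_total_low)
```

In this approximation the potential contribution is simply proportional to $\rho$ and, at this temperature, negative, reflecting the dominance of the attractive part of the potential.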


4.5.3.2 Pressure 


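The pressure in a liquid is given by the expression

\[
p = -\left( \frac{\partial F}{\partial V} \right)_{N,T} = k_B T\, \frac{\partial \ln Z^{[N]}}{\partial V} = k_B T\, \frac{\partial \ln C^{[N]}}{\partial V} \tag{4-74}
\]

(check this out; recall that $C^{[N]}$ stands for the $N$-particle configuration
integral).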


In working out the derivative with respect to volume, we observe that, in
the thermodynamic limit, the thermodynamic functions are independent
of the shape of the region in which the fluid is confined (provided the
shape is such that surface effects do not count), which means that we
can assume the region to be of a cubical shape of edge length $L$ ($= V^{1/3}$),
and vary the edge length so as to obtain the derivative. For this, we express
the configuration integral as

\[
C^{[N]} = \int_0^L \cdots \int_0^L dx_1 \cdots dz_N \exp\left[-\beta\Phi^{[N]}(x_1, y_1, z_1, \cdots, x_N, y_N, z_N)\right], \tag{4-75a}
\]

making explicit the Cartesian co-ordinates $x_1, y_1, z_1, \cdots, x_N, y_N, z_N$ as the
integration variables. We now define changed variables

\[
\xi_i = V^{-1/3} x_i, \quad \eta_i = V^{-1/3} y_i, \quad \zeta_i = V^{-1/3} z_i \quad (i = 1, 2, \cdots, N), \tag{4-75b}
\]

so that the potential energy $\Phi^{[N]}$ now involves $V$ explicitly. In terms of the
new variables, the configuration integral appears as

\[
C^{[N]} = V^N \int_0^1 \cdots \int_0^1 d\xi_1 \cdots d\zeta_N\, e^{-\beta\Phi^{[N]}}, \tag{4-75c}
\]

(check this out) where, in terms of the new variables

\[
\Phi^{[N]} = \sum_{1 \le i < j \le N} \phi(V^{1/3} u_{ij}), \qquad
u_{ij} = \left( (\xi_i - \xi_j)^2 + (\eta_i - \eta_j)^2 + (\zeta_i - \zeta_j)^2 \right)^{1/2}. \tag{4-75d}
\]


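The derivative with respect to $V$ can now be worked out so as to give

\[
\frac{\partial C^{[N]}}{\partial V} = N V^{N-1} \int_0^1 \cdots \int_0^1 d\xi_1 \cdots d\zeta_N\, e^{-\beta\Phi^{[N]}}
- \beta V^N \int_0^1 \cdots \int_0^1 d\xi_1 \cdots d\zeta_N\, \frac{\partial \Phi^{[N]}}{\partial V}\, e^{-\beta\Phi^{[N]}}, \tag{4-76}
\]

in which $\Phi^{[N]}$ is to be looked upon as a function of $V$ and the variables
$u_{ij}$ defined in (4-75d). The derivative $\frac{\partial \Phi^{[N]}}{\partial V}$ therefore evaluates to
$\frac{1}{3V^{2/3}} \sum_{1 \le i < j \le N} u_{ij} \frac{\partial \Phi^{[N]}}{\partial r_{ij}}$ where, in the last derivative, $\Phi^{[N]}$ is again looked upon
as a function of the old variables, i.e., the components of $\mathbf{r}_1, \cdots, \mathbf{r}_N$, in
terms of which $r_{ij}$ ($= V^{1/3} u_{ij}$) $= |\mathbf{r}_i - \mathbf{r}_j|$ ($1 \le i < j \le N$). The summation over
$i, j$ involves $\frac{N(N-1)}{2}$ terms, all of which contribute equally to the second
integral in (4-76), which allows us to consider the contribution of just a
single term, say, the one with $i = 1, j = 2$, and then multiply with $\frac{N(N-1)}{2}$.
One can now transform back to the old variables where $\Phi^{[N]}$ appears as a
function of the $r_{ij}$'s, giving

\[
\frac{\partial C^{[N]}}{\partial V} = \frac{N}{V} \int d^{(3)}r_1 \cdots d^{(3)}r_N\, e^{-\beta\Phi^{[N]}}
- \frac{\beta N(N-1)}{6V} \int d^{(3)}r_1 \cdots d^{(3)}r_N\, r_{12}\, \frac{\partial \phi(r_{12})}{\partial r_{12}}\, e^{-\beta\Phi^{[N]}}. \tag{4-77}
\]

We now make use of the definition (4-63a) (with $n = 2$) and divide both
sides of (4-77) by $C^{[N]}$, so as to obtain

\[
\frac{\partial \ln C^{[N]}}{\partial V} = \frac{N}{V} - \frac{1}{6 V k_B T} \int d^{(3)}r_1\, d^{(3)}r_2\, r_{12}\, \frac{\partial \phi(r_{12})}{\partial r_{12}}\, \rho^{[2]}(r_{12}). \tag{4-78}
\]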


Finally, making use of the last equality in (4-74) and the definition of the
pair correlation function in terms of $\rho^{[2]}$, one obtains the pressure equation

\[
p = \rho k_B T - \frac{2\pi\rho^2}{3} \int_0^\infty r^3\, \frac{d\phi(r)}{dr}\, g(r; \rho, T)\, dr. \tag{4-79}
\]

Once again, we find that, under the assumption of two-particle pair
interactions, the thermodynamic behavior of a liquid is determined by the
pair correlation function, for which the dependence on $\rho$ and $T$ has been
made explicit (at a more fundamental level, though, the two-particle
potential $\phi(r)$ determines everything, including $g(r)$ itself). The above
formula segregates the kinetic and the potential contributions to the
pressure, and can be read as the equation of state of a fluid, being valid for
both the liquid and the gaseous phases. In the case of a dilute gas, one
can expand the pair correlation function as a power series in $\rho$ as

\[
g(r; \rho, T) = g_0(r; T) + \rho\, g_1(r; T) + \rho^2 g_2(r; T) + \cdots \tag{4-80}
\]

and, on substituting in (4-79) and comparing with the virial expansion (4-2),
obtain the virial coefficients $B_2, B_3, \cdots$. In particular, the second virial
coefficient is obtained as

\[
B_2(T) = -\frac{2\pi}{3 k_B T} \int_0^\infty r^3\, \frac{d\phi(r)}{dr}\, g_0(r; T)\, dr. \tag{4-81}
\]

Evidently the fact that the total potential energy can be decomposed into
a sum of isotropic two-body potentials explains the thermodynamic relevance
of the pair correlation function.
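
Formula (4-81) can be tested numerically against the familiar expression $B_2 = -2\pi \int_0^\infty (e^{-\beta\phi} - 1) r^2 dr$, to which it reduces on integration by parts once $g_0 = e^{-\beta\phi}$ is inserted. The sketch below makes two assumptions that are not part of the text: the Lennard-Jones potential in reduced units ($\epsilon = \sigma = k_B = 1$) as a model for $\phi$, and the dilute-limit form $g_0(r; T) = e^{-\phi(r)/T}$.

```python
import math

def lj(r):
    """Lennard-Jones pair potential in reduced units."""
    inv6 = r ** -6
    return 4.0 * (inv6 * inv6 - inv6)

def dlj(r):
    """Derivative d(phi)/dr of the Lennard-Jones potential."""
    return 4.0 * (-12.0 * r ** -13 + 6.0 * r ** -7)

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

R_MIN, R_MAX, STEPS = 0.3, 40.0, 40000

def b2_from_pressure_equation(T):
    """B2 from (4-81), with the assumed g0 = exp(-phi/T)."""
    f = lambda r: r ** 3 * dlj(r) * math.exp(-lj(r) / T)
    return -(2.0 * math.pi / (3.0 * T)) * simpson(f, R_MIN, R_MAX, STEPS)

def b2_mayer(T):
    """B2 = -2 pi int (exp(-phi/T) - 1) r^2 dr.

    For r < R_MIN the Boltzmann factor vanishes to machine precision, so
    that region contributes 2 pi R_MIN^3 / 3 analytically.
    """
    f = lambda r: (math.exp(-lj(r) / T) - 1.0) * r * r
    core = 2.0 * math.pi * R_MIN ** 3 / 3.0
    return core - 2.0 * math.pi * simpson(f, R_MIN, R_MAX, STEPS)

b2_values = {T: (b2_from_pressure_equation(T), b2_mayer(T)) for T in (1.0, 10.0)}
for T, (from_481, from_mayer) in b2_values.items():
    print(T, from_481, from_mayer)
```

The two routes agree to within the truncation error of the finite upper limit, and the sign of $B_2$ changes from negative at low temperature (attraction dominates) to positive at high temperature (repulsion dominates).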


4.5.3.3 Isothermal compressibility 


The isothermal compressibility is one of many thermodynamic static response
functions, and is defined as

\[
\kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial p} \right)_T. \tag{4-82}
\]

This can be related to the particle number fluctuation, obtained from the
grand canonical partition function.

The relative fluctuation in the particle number, defined as $\frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle^2}$,
goes to zero in the thermodynamic limit. On the other hand, $\frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle}$
tends to a finite limit, and is related to the isothermal compressibility,
thereby providing us with a statistical interpretation of the latter.


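Making use of the thermodynamic formulae

\[
\Omega = -pV, \qquad d\Omega = -p\,dV - S\,dT - N\,d\mu, \tag{4-83a}
\]

one obtains, at constant temperature (i.e., with $V$ and $\mu$ as independent
variables)

\[
N\,d\mu = V\,dp \quad (T = \text{constant}), \tag{4-83b}
\]

(check this out) from which we get, in terms of the specific volume $v = \frac{V}{N}$,

\[
\left( \frac{\partial \mu}{\partial v} \right)_T = v \left( \frac{\partial p}{\partial v} \right)_T. \tag{4-83c}
\]

On the other hand, making use of the definition of the grand partition
function, one obtains

\[
\langle N^2 \rangle - \langle N \rangle^2 = k_B T \left( \frac{\partial \langle N \rangle}{\partial \mu} \right)_{V,T} \tag{4-84}
\]

(check this out as well, making two successive differentiations with respect
to $\mu$ at constant $V, T$), which gives (making use of $v = \frac{V}{N}$)

\[
\frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle^2}
= \frac{k_B T}{\langle N \rangle^2} \left( \frac{\partial \langle N \rangle}{\partial \mu} \right)_{V,T}
= -\frac{k_B T\, V}{\langle N \rangle^2 v^2} \left( \frac{\partial v}{\partial \mu} \right)_{V,T}. \tag{4-85}
\]

Formulas (4-82), (4-83c) then give

\[
\frac{k_B T}{v}\, \kappa_T = \frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle}, \tag{4-86}
\]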


which is the formula we were after. In deriving this, we have assumed 
the existence of derivatives of the relevant thermodynamic variables (in 
particular, of the pressure p), which is consistent in the absence of phase 
transitions. In other words, formula (4-86) holds under conditions away 
from a phase transition, and tells us that the pressure is a decreasing 


function of the specific volume under such conditions. 
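
As a simple check of (4-86), consider the classical ideal gas, for which the grand canonical weights of the particle number are of the Poisson form $P(N) \propto \bar{N}^N / N!$ (a standard result that we assume here without derivation). The sketch below evaluates the mean and variance of $N$ from these weights; the value $\bar{N} = 50$ and the cut-off at $N = 400$ are arbitrary illustrative choices.

```python
import math

def number_statistics(mean_n, n_max):
    """Mean and variance of N for weights P(N) proportional to mean_n**N / N!."""
    # work with logarithms of the weights to avoid overflow
    logw = [n * math.log(mean_n) - math.lgamma(n + 1) for n in range(n_max + 1)]
    shift = max(logw)
    w = [math.exp(x - shift) for x in logw]
    norm = sum(w)
    avg = sum(n * p for n, p in enumerate(w)) / norm
    avg2 = sum(n * n * p for n, p in enumerate(w)) / norm
    return avg, avg2 - avg * avg

mean_n, var_n = number_statistics(50.0, 400)
ratio = var_n / mean_n  # equals (k_B T / v) * kappa_T according to (4-86)
print(mean_n, var_n, ratio)
```

The ratio comes out as unity, so that (4-86) gives $\kappa_T = \frac{v}{k_B T} = \frac{1}{p}$, in agreement with what one obtains directly from the ideal gas equation of state.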


In formula (4-86), the left hand side is made up of thermodynamic functions
while the right hand side, which makes sense in terms of statistical
mechanics, can be related to the pair correlation function. Since it
involves fluctuations in the particle number, one has to refer to the grand
partition function, which tells us that the probability for the particle
number to have the value $N$ is

\[
P(N) = \frac{Z^{[N]} e^{\beta\mu N}}{\mathcal{Z}} \tag{4-87}
\]

(reason this out; here $Z^{[N]}$ is the $N$-particle canonical partition function and
$\mathcal{Z}$ the grand partition function). It is this probability that acts as the weight factor for
working out mean values of quantities depending on the particle number
$N$ (along with other relevant parameters, which we do not explicitly refer
to). For instance, for a quantity $A(N)$, which is defined in the canonical
ensemble, the mean value in the grand canonical ensemble (referred to
by the sub-index ‘g’) is given by

\[
\langle A(N) \rangle_g = \sum_N P(N)\, A(N). \tag{4-88}
\]

Thus the n-particle distribution function $\rho^{[n]}$ now appears as

\[
\rho^{[n]}(\mathbf{r}^{[n]}) = \left\langle \rho_N^{[n]}(\mathbf{r}^{[n]}) \right\rangle_g, \tag{4-89}
\]

where the notation $\rho_N^{[n]}(\mathbf{r}^{[n]})$ is used to denote the n-particle distribution
function previously defined with reference to the canonical ensemble for
a fixed $N$, whereas the distribution function $\rho^{[n]}$ appearing on the left hand
side of the above formula is to be defined as being proportional to the
probability of $n$ particles to be located at $\mathbf{r}^{[n]}$, regardless of the particle
number $N$ and of the other particles of the system. The normalization
of this distribution function is (compare formula (4-63b) in which the
distribution function occurring on the left hand side is actually $\rho_N^{[n]}$ in our
present notation)

\[
\int \rho^{[n]}(\mathbf{r}^{[n]})\, d^{[n]}r = \left\langle \frac{N!}{(N-n)!} \right\rangle_g. \tag{4-90}
\]

Thus, substituting $n = 1$ and $n = 2$ in turn, we obtain (using a slightly
different and more familiar notation for an average over the particle number,
i.e., dropping the sub-index ‘g’)

\[
\int \rho^{[1]}(\mathbf{r})\, d^{(3)}r = \langle N \rangle, \tag{4-91a}
\]

\[
\int \rho^{[2]}(\mathbf{r}_1, \mathbf{r}_2)\, d^{(3)}r_1\, d^{(3)}r_2 = \langle N^2 \rangle - \langle N \rangle, \tag{4-91b}
\]

the first of these two formulas being nothing but a restatement of (4-64)
(recall that quantities defined with reference to the canonical ensemble
necessarily agree with those defined with reference to the grand canonical
ensemble in the thermodynamic limit). On making use of (4-66)
in (4-91b), the above two formulas yield

\[
1 + \rho \int d^{(3)}r\, (g(r) - 1) = \frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle}, \tag{4-92a}
\]

i.e., finally (using (4-86)),

\[
\rho k_B T \kappa_T = 1 + \rho \int d^{(3)}r\, (g(r) - 1). \tag{4-92b}
\]

This is referred to as the compressibility equation.


On reviewing how the compressibility equation is arrived at, you will find
that the derivation does not require the pairwise additivity of the potential
energy function $\Phi^{[N]}$, which makes it an exact formula expressing
a thermodynamic function in terms of the pair correlation function, in
which one can replace $g(r)$ with the more general $g(\mathbf{r})$, depending on the
vector $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ (see para following eq. (4-66)).


Though the thermodynamic functions such as the internal energy, pressure,
and the isothermal compressibility can be expressed in terms of
the pair correlation function, the latter does not determine all the
thermodynamic functions of the system under consideration. For instance,
the chemical potential of the liquid cannot be expressed in terms of $g(r)$,
$\rho$, and $T$ by means of a closed expression. An expression, however, can
be obtained in terms of a modified distribution function $g(r; \xi)$, where $\xi$
is an additional ‘tuning’ parameter by which the interaction of some chosen
molecule (out of all the $N$ molecules making up the liquid) with all
the other molecules is assumed to be scaled (see [111], [66], [101]), and
where $g(r; \xi = 1)$ coincides with the unmodified pair correlation function
$g(r)$. The dependence of $g(r; \xi)$ on $\xi$ can be determined only in terms
of higher order correlation functions. In modification of our earlier statement
we now assert (without proof!) that all the thermodynamic functions
of a liquid can be determined in terms of $\rho$, $T$, and $g(r; \xi)$.


4.5.4 The structure factor 


Fig. 4-9 depicts schematically the scattering of a beam of radiation from
an atom in a liquid. A beam of X-rays characterized by a wave vector
$\mathbf{k}$ ($|\mathbf{k}| = \frac{2\pi}{\lambda}$) is scattered elastically from atoms such as the one located at
A (position vector, say, $\mathbf{r}_j$; the index $j$ distinguishes between atoms in the
liquid sample), and the scattered beam, characterized by a wave vector
$\mathbf{k}'$ ($|\mathbf{k}'| = |\mathbf{k}|$), is observed at an angle $\theta$ with the incoming beam.

If $A$ denotes the amplitude of the incoming wave, then the scattered
amplitude is of the form (see, for instance, [19])

\[
A' \propto A f(\mathbf{q})\, e^{-i\mathbf{q}\cdot\mathbf{r}_j} \quad (\mathbf{q} = \mathbf{k}' - \mathbf{k}). \tag{4-93}
\]

Figure 4-9: A typical X-ray scattering set-up (schematic); an incident beam with
wave vector $\mathbf{k}$ is scattered in a direction $\theta$ from atom A in a scatterer sample, with
wave vector $\mathbf{k}'$; considering all the atoms in the sample, the ensemble-averaged
scattered intensity in the direction $\theta$ is given by (4-97b), where $F(\theta)$ is the atomic
scattering factor, having the same value for all the scattering atoms, while the
short-range structure of the liquid, as revealed in fig. 4-7, is responsible for the
factor $S(\mathbf{q})$ (formula (4-97a)), referred to as the structure factor; the latter is
related to the pair correlation function by a Fourier transform.

We assume that a plane wave is being scattered elastically from the 

atoms of the liquid, and the intensity of the scattered radiation is 

observed at a sufficiently large distance from the scattering sample. A 

slow variation of the scattered amplitude associated with the inverse 

square law of intensity is ignored compared with the rapid variation 


of the phase, and an over-all phase factor is ignored as being of no 


consequence in determining the observed intensity. 


The above expression of the scattered amplitude involves two factors of
relevance: (a) the atomic scattering factor $f(\mathbf{q})$, which depends on the
interaction of the incident beam with the electron density distribution in
the atom (the effect of the atomic nucleus on the scattering is negligible,
because of the relatively large mass of the nucleus), and determines the
angular distribution of the scattered radiation in the immediate vicinity
of the scatterer, being identical for all the scatterers in the sample and
(b) the phase factor $e^{-i\mathbf{q}\cdot\mathbf{r}_j}$ arising due to the optical path to and from the
scatterer, which also contributes to the angular distribution. The vector $\mathbf{q}$
represents the difference of the outgoing and the incoming wave vectors,
and can be written as

\[
\mathbf{q} = \frac{4\pi}{\lambda} \sin\frac{\theta}{2}\; \hat{\mathbf{q}}, \tag{4-94}
\]

(reason this out). The scattered amplitude resulting from a distribution
of $N$ atoms in the scattering sample is given by

\[
A' \propto A\, F(\theta) \sum_{j=1}^{N} e^{-i\mathbf{q}\cdot\mathbf{r}_j}, \tag{4-95a}
\]

where we have written $F(\theta)$ in place of $f(\mathbf{q})$ to indicate the angular
dependence of the atomic scattering factor. The scattered intensity in the
direction $\theta$ (see figure) is then given by

\[
I'(\theta) = I_0 |F(\theta)|^2 \sum_{j,k=1}^{N} e^{-i\mathbf{q}\cdot(\mathbf{r}_j - \mathbf{r}_k)}, \tag{4-95b}
\]

where $I_0$ is an appropriate constant of proportionality, which is not of
relevance in the relative intensity scattered in the various directions.
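
The magnitude of $\mathbf{q}$ quoted in (4-94) follows from $|\mathbf{k}'| = |\mathbf{k}| = \frac{2\pi}{\lambda}$ and the angle $\theta$ between the two wave vectors. A small numerical check is sketched below; the choice of the $x$-axis for the incident beam and the sample values of $\lambda$ and $\theta$ are arbitrary assumptions made for illustration.

```python
import math

def momentum_transfer(wavelength, theta):
    """q = k' - k for elastic scattering through the angle theta."""
    k = 2.0 * math.pi / wavelength
    k_in = (k, 0.0, 0.0)                                       # incident beam along x
    k_out = (k * math.cos(theta), k * math.sin(theta), 0.0)    # scattered beam
    return tuple(o - i for o, i in zip(k_out, k_in))

def q_magnitude_formula(wavelength, theta):
    """Magnitude of q according to (4-94)."""
    return 4.0 * math.pi / wavelength * math.sin(theta / 2.0)

wavelength, theta = 1.54, 0.7
q_vec = momentum_transfer(wavelength, theta)
q_norm = math.sqrt(sum(c * c for c in q_vec))
q_formula = q_magnitude_formula(wavelength, theta)
print(q_norm, q_formula)
```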


Since the atomic position vectors $\mathbf{r}_j$ ($j = 1, 2, \cdots$) are all subject to thermal
fluctuations, the observed intensity will be an average of the above
expression, and is given by

\[
I(\theta) = I_0 |F(\theta)|^2 \sum_{j,k=1}^{N} \left\langle e^{-i\mathbf{q}\cdot(\mathbf{r}_j - \mathbf{r}_k)} \right\rangle, \tag{4-96}
\]

where the angular brackets denote an ensemble average which, in the
present context, is an average in the canonical ensemble. In the case
of a dilute gas, the position vectors are uncorrelated, and the scattered
intensity is approximately a constant, being independent of the scattering
angle. In contrast, the position vectors are strongly correlated in a solid,
especially at low temperatures, and the scattered intensity shows sharp
peaks distributed regularly as a function of the scattering angle. The
situation in the case of a liquid is of an intermediate nature since the
liquid molecules (we have been considering atomic liquids for the sake
of simplicity) have a short-range correlation, which is destroyed at larger
distances. The scattered intensity shows peaks in a diffuse background,
where the peaks lose their definition at relatively high temperatures.


Defining the structure factor $S(\mathbf{q})$ as

\[
S(\mathbf{q}) = \frac{1}{N} \sum_{j,k=1}^{N} \left\langle e^{-i\mathbf{q}\cdot(\mathbf{r}_j - \mathbf{r}_k)} \right\rangle, \tag{4-97a}
\]

we can express the scattered intensity in direction $\theta$ as

\[
I(\theta) = N I_0 |F(\theta)|^2 S(\mathbf{q}), \tag{4-97b}
\]

where the factors $\frac{1}{N}$ and $N$ in (4-97a), (4-97b) are inserted in acknowledgement
of the total number of scatterers in the scattering sample, so
that the structure factor depends only on intensive variables. An analysis
of the scattering data gives us the structure factor which can now
be related to the pair correlation function by noting that the double summation
over $j, k$ can be broken up into a part with $j = k$ ($N$ terms, each
of value 1), and another with $j \ne k$ ($N(N-1)$ terms, all equal, which we
evaluate by choosing $j = 1, k = 2$), so as to give

\[
S(\mathbf{q}) = 1 + \frac{N(N-1)}{N} \left\langle e^{-i\mathbf{q}\cdot(\mathbf{r}_1 - \mathbf{r}_2)} \right\rangle
= 1 + \frac{1}{N} \int d^{(3)}r_1\, d^{(3)}r_2\, \rho^{[2]}(\mathbf{r}_1, \mathbf{r}_2)\, e^{-i\mathbf{q}\cdot(\mathbf{r}_1 - \mathbf{r}_2)}, \tag{4-98a}
\]

(check this formula out; make use of the definition (4-63a), with $n = 2$).
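
For a single configuration of the atoms, the double sum in (4-97a) equals $\frac{1}{N} \left| \sum_j e^{-i\mathbf{q}\cdot\mathbf{r}_j} \right|^2$, which is the form commonly used for evaluating the structure factor from particle positions. The sketch below applies it to a small simple cubic lattice, chosen here purely as an illustration of the sharp peaks mentioned above for a solid; the lattice size and spacing are arbitrary, and no thermal averaging is performed.

```python
import cmath
import math

def structure_factor(positions, q):
    """Single-configuration value of (1/N) |sum_j exp(-i q.r_j)|^2."""
    amplitude = sum(cmath.exp(-1j * (q[0] * x + q[1] * y + q[2] * z))
                    for x, y, z in positions)
    return abs(amplitude) ** 2 / len(positions)

a, m = 1.0, 4  # lattice spacing and number of sites per edge (illustrative)
lattice = [(a * i, a * j, a * k)
           for i in range(m) for j in range(m) for k in range(m)]

s_bragg = structure_factor(lattice, (2.0 * math.pi / a, 0.0, 0.0))
s_off = structure_factor(lattice, (math.pi / a, 0.0, 0.0))
print(s_bragg, s_off)
```

At a reciprocal lattice vector all the phase factors add coherently and the value equals the number of sites, while at the intermediate wave vector chosen here the contributions cancel.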


Finally, introducing the pair correlation function $g(r)$, one obtains

\[
S(\mathbf{q}) = 1 + \rho \int d^{(3)}r\, g(r)\, e^{-i\mathbf{q}\cdot\mathbf{r}}, \tag{4-98b}
\]

which we write as

\[
S(\mathbf{q}) - 1 = \rho \int d^{(3)}r\, (g(r) - 1)\, e^{-i\mathbf{q}\cdot\mathbf{r}} + (2\pi)^3 \rho\, \delta(\mathbf{q}), \tag{4-98c}
\]

(check this out) which tells us that the structure factor is directly related
to the Fourier transform of the pair correlation function. The last term on
the right hand side assumes relevance only for scattering in the forward
direction ($\mathbf{k} = \mathbf{k}'$), and corresponds to the un-scattered part of the incident
beam of radiation. As such, the structure factor $S(\mathbf{q})$ is commonly defined
with this term omitted. In other words, discounting the un-scattered
incident beam one has

\[
S(q) - 1 = \rho \int d^{(3)}r\, (g(r) - 1)\, e^{-i\mathbf{q}\cdot\mathbf{r}}
= 4\pi\rho \int_0^\infty r^2 (g(r) - 1)\, \frac{\sin qr}{qr}\, dr, \tag{4-98d}
\]

(check this out as well) where the last formula shows that $S(q)$ depends
on the magnitude ($q = |\mathbf{q}|$) of $\mathbf{q}$ alone.

Summarizing, an analysis of the scattering data gives us the radial
distribution function $g(r)$, which is obtained by means of an inverse Fourier
transform from the structure factor. Pair correlation functions for a large
number of liquids have been obtained experimentally under a wide range
of values of $\rho, T$ in this manner.
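
The radial form of (4-98d) lends itself to straightforward numerical quadrature. The sketch below uses a toy pair correlation function, $g(r) = 1 - e^{-(r/\sigma)^2}$, which is an assumption made only because its transform is known in closed form, $S(q) = 1 - \rho\, \pi^{3/2} \sigma^3 e^{-q^2\sigma^2/4}$; it is not a model of any real liquid, and the values of $\rho$ and $\sigma$ are arbitrary.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

def structure_factor_from_g(q, g, rho, r_max, n=4000):
    """S(q) from the radial integral in (4-98d), truncated at r_max."""
    def integrand(r):
        x = q * r
        sinc = 1.0 if x == 0.0 else math.sin(x) / x
        return r * r * (g(r) - 1.0) * sinc
    return 1.0 + 4.0 * math.pi * rho * simpson(integrand, 0.0, r_max, n)

SIGMA, RHO = 1.0, 0.1  # illustrative values in arbitrary units

def g_toy(r):
    """Toy pair correlation with a Gaussian correlation hole (assumption)."""
    return 1.0 - math.exp(-(r / SIGMA) ** 2)

def s_exact(q):
    """Closed-form transform of the toy model."""
    return 1.0 - RHO * math.pi ** 1.5 * SIGMA ** 3 * math.exp(-(q * SIGMA) ** 2 / 4.0)

qs = [0.0, 1.0, 3.0, 6.0]
s_num = [structure_factor_from_g(q, g_toy, RHO, r_max=12.0) for q in qs]
for q, s in zip(qs, s_num):
    print(q, s, s_exact(q))
```

In an actual analysis the roles are reversed: the measured $S(q)$ is the input, and $g(r)$ is recovered from the inverse transform.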


Neutron scattering experiments with slow neutrons provide a better means
of determining the pair correlation function of liquids since neutrons are
scattered mostly from the nuclei because of their comparatively large
mass, and slow neutrons are affected little by the nuclear strong interaction
forces, as a consequence of which the scattering factor $F(\theta)$ is almost
a constant, thereby exerting little masking effect in the determination of
the structure factor from the scattering data.


Fig. 4-10 depicts schematically the variation of the structure factor S(q), 


related to the pair correlation function g(r) as in (4-98d). 


Incidentally, the structure factor $S(q)$ at $q = 0$ is related to the isothermal
compressibility $\kappa_T$, as implied by formulas (4-92b) and (4-98d):

\[
S(0) = \rho k_B T \kappa_T. \tag{4-99}
\]

This formula provides us with a method of determining the large wavelength
limit of the structure factor from the experimental determination
of the compressibility (a small positive quantity for most liquids), since a
determination of $S(0)$ from scattering data poses a difficult problem owing
to the masking effect of the un-scattered part of the incident beam.


Figure 4-10: Depicting the variation of the structure factor S(q) of a liquid, 
related to the pair correlation function by a Fourier transformation as in (4-98d); 
the successive maxima of S(q) are indicative of the short range order in the 
liquid; the maxima broaden out to a constant value for large q because of the 
long-range disorder in the liquid. 

Having seen how the pair correlation function (or the radial distribution
function, as it is commonly referred to) $g(r)$ determines the thermodynamic
behavior of a liquid (under the assumption that the intermolecular
potential energy can be decomposed into a sum of two-particle potentials),
and how it is related to the structure factor that can be obtained
experimentally from scattering data, we now turn to the theoretical schemes
underlying the calculation of the pair correlation function in terms of the
intermolecular potential.

4.5.5 The BGY hierarchy

We start from the definition of the n-particle distribution function $\rho^{[n]}$ (in
the notation introduced in 4.5.2)

\[
\rho^{[n]}(\mathbf{r}^{[n]}) = \frac{N!}{(N-n)!}\, \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1} \cdots d^{(3)}r_N\, e^{-\beta\Phi^{[N]}(\mathbf{r}^{[N]})}. \tag{4-100}
\]
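We now take the vector derivative of this equation with respect to $\mathbf{r}_1$ (we
denote the corresponding differential operator by $\nabla_1$; any one among the
$n$ vectors $\mathbf{r}^{[n]}$ could be made use of instead of $\mathbf{r}_1$ to perform the operation
of differentiation). On the right hand side, we decompose $\Phi^{[N]}(\mathbf{r}^{[N]})$ in terms of
pair potentials as

\[
\Phi^{[N]}(\mathbf{r}^{[N]}) = \sum_{i=2}^{N} \phi(\mathbf{r}_1, \mathbf{r}_i) + \sum_{2 \le i < j \le N} \phi(\mathbf{r}_i, \mathbf{r}_j). \tag{4-101}
\]

The derivative with respect to $\mathbf{r}_1$ acts on the first part in this sum, giving

\[
\nabla_1 \rho^{[n]}(\mathbf{r}^{[n]}) = -\beta\, \frac{N!}{(N-n)!}\, \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1} \cdots d^{(3)}r_N
\sum_{i=2}^{N} \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_i)\, e^{-\beta\Phi^{[N]}(\mathbf{r}^{[N]})}. \tag{4-102a}
\]

The sum under the integral sign can be further broken up as $\sum_{i=2}^{N} =
\sum_{i=2}^{n} + \sum_{i=n+1}^{N}$. Since the variables $\mathbf{r}_1, \cdots, \mathbf{r}_n$ are not involved in the
integration, we can bring the sum $\sum_{i=2}^{n} \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_i)$ outside the integral, which
gives

\[
\nabla_1 \rho^{[n]}(\mathbf{r}^{[n]}) = -\beta \rho^{[n]}(\mathbf{r}^{[n]}) \sum_{i=2}^{n} \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_i)
- \beta\, \frac{N!}{(N-n)!}\, \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1} \cdots d^{(3)}r_N
\sum_{i=n+1}^{N} \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_i)\, e^{-\beta\Phi^{[N]}}. \tag{4-102b}
\]

The last sum on the right hand side gives, on integration, $N - n$
equal terms, and can be expressed as

\[
-\beta\, \frac{N!}{(N-n-1)!}\, \frac{1}{C^{[N]}} \int d^{(3)}r_{n+1}\, \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_{n+1})
\left[ \int d^{(3)}r_{n+2} \cdots d^{(3)}r_N\, e^{-\beta\Phi^{[N]}} \right].
\]

One can then make use of the definition (4-100), written with $n + 1$
replacing $n$, to obtain

\[
\nabla_1 \rho^{[n]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_n) = -\beta \rho^{[n]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_n) \sum_{i=2}^{n} \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_i)
- \beta \int d^{(3)}r_{n+1}\, \nabla_1 \phi(\mathbf{r}_1, \mathbf{r}_{n+1})\, \rho^{[n+1]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_{n+1}). \tag{4-103}
\]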


This equation, referred to as the Born-Green-Yvon (BGY) hierarchy ([73],
chapter 8, [23], chapter 2), gives $\rho^{[n]}$ in terms of $\rho^{[n+1]}$. Thus, one has to
know $\rho^{[3]}$ in order to solve for $\rho^{[2]}$, and then to know $\rho^{[4]}$ in order to solve
for $\rho^{[3]}$, and so on. In other words, the hierarchy of equations represented
by (4-103) is not a closed one, and one has to look for ways to close
it in some scheme of approximation. Since it is the radial distribution
function that is commonly the object of interest, one needs, in particular,
to obtain a closed set of equations for $\rho^{[2]}(\mathbf{r}_1, \mathbf{r}_2) = \rho^2 g(|\mathbf{r}_2 - \mathbf{r}_1|)$ (assuming
spherical symmetry). The equation for $g(r)$ that one obtains from (4-103)
(by putting $n = 2$) reads (recall the relation $\rho^{[2]} = \rho^2 g$)

\[
-\beta^{-1} \nabla_1 g(|\mathbf{r}_2 - \mathbf{r}_1|) = g(|\mathbf{r}_2 - \mathbf{r}_1|)\, \nabla_1 \phi(|\mathbf{r}_2 - \mathbf{r}_1|)
+ \rho \int d^{(3)}r_3\, \nabla_1 \phi(|\mathbf{r}_1 - \mathbf{r}_3|)\, g^{[3]}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3). \tag{4-104}
\]


1. The equation (4-103) can be generalized by adding a term that
gives the effect of an external field, if any, on the radial distri- 
bution function. Among other things, the external field destroys 
the spherical symmetry of the radial distribution function. In the 
present exposition, in which I focus mostly on basic principles, 


the external field is not considered. 


2. Along with (4-104), one needs to consider the equation preced- 
ing it in the hierarchy, i.e., the one obtained by putting n = 1 
in (4-103). However, both sides of that equation vanish identi- 
cally for a liquid in equilibrium in the absence of external fields 


(check this out). 


3. The BGY equations constitute the equilibrium version of a hi- 
erarchy of equations describing the evolution of time dependent 
distribution functions in a fluid, where an external field is in- 
cluded for the sake of generality. This more general set of equa- 
tions is commonly referred to as the Bogolyubov-Born-Green- 


Kirkwood-Yvon (BBGKY) hierarchy (refer to section 8.3.4). 


The most commonly adopted strategy for converting (4-104) to a closed
equation is the one of assuming, by way of an effective approximation,
the Kirkwood prescription:

\[
g^{[3]}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3) = g(\mathbf{r}_1, \mathbf{r}_2)\, g(\mathbf{r}_2, \mathbf{r}_3)\, g(\mathbf{r}_3, \mathbf{r}_1), \tag{4-105}
\]

commonly referred to as the superposition approximation. It means that
the correlation between particles located at points $\mathbf{r}_1, \mathbf{r}_2$ is independent of
a third particle located at, say, $\mathbf{r}_3$, as a result of which the three-particle
correlation is the product of three two-particle correlations. Introducing
this approximation in (4-104), one obtains the closed integro-differential
equation

\[
-\beta^{-1} \nabla_1 \ln g(r_{12}) = \nabla_1 \phi(r_{12}) + \rho \int d^{(3)}r_3\, \nabla_1 \phi(r_{31})\, g(r_{31})\, g(r_{23}) \tag{4-106}
\]

(where $r_{12} = |\mathbf{r}_2 - \mathbf{r}_1|$, etc.). One obtains, in principle, the radial distribution
function $g(r)$ by solving this equation, subject to the boundary condition
$g(r) \to 1$ for $r \to \infty$. In practice, one resorts to numerical solutions.
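
As a quick consistency check on (4-106), one may look at its low density limit, where the integral term, being proportional to $\rho$, drops out. The short derivation below is a sketch added for illustration; it uses only (4-106), the boundary condition $g \to 1$, and the assumption that $\phi(r) \to 0$ as $r \to \infty$.

```latex
% Low density limit of (4-106): the integral term carries a factor rho
-\beta^{-1} \nabla_1 \ln g(r_{12}) = \nabla_1 \phi(r_{12}) \qquad (\rho \to 0)
% integrating along r, with an integration constant
\ln g(r) = -\beta \phi(r) + \text{const.}
% g -> 1 and phi -> 0 at large r fix the constant to zero
g(r) = e^{-\beta \phi(r)} \qquad (\rho \to 0)
```

This is the Boltzmann factor of the pair potential, i.e., the leading term $g_0(r; T)$ in the density expansion (4-80).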


Strictly speaking, the superposition approximation is tenable for a fluid
at low densities or when one among a group of three particles is at a
large distance from the remaining two. However, in reality, it is found
that (4-106) describes the behavior of a fluid over a reasonably wide range
of physical conditions. In particular, using a hard sphere potential for
$\phi(r)$, one finds that the above approximation describes a liquid right up
to the value of $\rho$ (for a given value of $T$) at which a phase transition to
a crystalline configuration takes place (this is a transition specific to the
hard sphere fluid, as observed in molecular dynamics simulations, and
is referred to as the Kirkwood transition).


There exists a vast literature on approximation schemes (ones related to 
the Kirkwood approximation and also a number of others distinct from 
it) for the theoretical determination of the radial distribution function. In 
the present introductory exposition we include, in sec. 4.5.6 below, only 
a few basic considerations relating to what is referred to as the direct 


correlation function. 


In the context of the BGY hierarchy, mention is to be made of the Kirk- 
wood integral equation, which results from an alternative and more explicit 
approach to the determination of the radial distribution function. In this 
approach ([66], chapter 6; [23], chapter 2), one introduces an additional
parameter describing a variable coupling strength ($\xi$) between a chosen
particle and the other particles in the liquid, and ends up, on making the
superposition approximation (4-105), with an integral equation in terms of
$\xi$. The actual solution is obtained on putting $\xi = 1$, while $\xi = 0$
corresponds to the chosen particle being decoupled from the rest of the
system, thereby providing a boundary condition for the solution of the
integral equation.


4.5.6 The direct correlation function 


A highly effective approach to the liquid state is by means of the direct
correlation function (c(r), see below) introduced by means of the formula

\[ h(r_{12}) = g(r_{12}) - 1 = c(r_{12}) + \rho \int d\mathbf{r}_3\, h(r_{23})\, c(r_{13}). \tag{4-107} \]

Though no more than a definition of the function c(r), this equation is
of relevance because the direct correlation function it introduces has
a substantial physical significance that sets it apart from the radial
distribution function g(r) or the related function h(r) (which is a more
appropriate indicator, as compared to g(r), of the extent of two-particle
correlation in a fluid since it goes to zero as $r \to \infty$). It tells us that
the two-particle correlation expressed by h(r) (referred to as the total
two-particle correlation function) is made up of two parts: a direct part
expressed by c(r) and an indirect part, mediated by other particles in the
fluid (second term on the right hand side above).


It is to be mentioned that h(r) is the total two-particle correlation
function (it derives from $\rho^{[2]}(r) = \rho^2 g(r)$), and hence c(r) is a
measure of the direct two-particle correlation; one can, in an analogous
manner, look at the direct correlations involving three, four, or more
particles and, for each of these, indirect correlations are also to be
considered. However, the two-particle correlations (total, direct, and
indirect) are of overriding interest for a wide range of physical conditions.


Formula (4-107) is referred to as the Ornstein-Zernike (OZ) equation and 
constitutes the basis for a fruitful approach to the study of the liquid 
state since the direct correlation function (c(r)) it defines has a simple 
non-oscillatory structure as compared with g(r) (an oscillating function), 
is amenable to a comparatively more precise experimental determination, 
can be obtained theoretically in terms of the pair potential by means of
a number of useful and efficient approximation schemes, and provides
substantially relevant information regarding the pair potential itself.


Fig. 4-11 compares the structure of the direct correlation function c(r) 
with that of the radial distribution function g(r). While the latter is seen 
to be characterized by a long range (of the order of the distance up to 
which molecular correlations are established) and is oscillatory in nature, 
with a few well defined peaks, the former is non-oscillatory and of short 


range (of the order of the range of the intermolecular potential). 


Figure 4-11: Comparison (schematic) between the direct correlation function 
(c(r)) and the radial distribution function (g(r); in the figure, g(r) - 1 is plotted
against r) of a liquid; while the direct correlation function is non-oscillatory and 
of a short range, the radial distribution function is of a more complex nature; 
based on [101], fig. 13.5. 


A Fourier transformation of the two sides of the OZ equation gives the
following simple relation between the Fourier transforms, $\tilde{h}(q)$, $\tilde{c}(q)$,
of the total and the direct correlation functions (h(r), c(r)):

\[ \tilde{h}(q) = \tilde{c}(q) + \rho\, \tilde{h}(q)\, \tilde{c}(q), \tag{4-108a} \]


(check this out; make use of the result that the Fourier transform of a
convolution of two functions is the product of their Fourier transforms).
Further, the formula (4-98d) relates the Fourier transform $\tilde{h}(q)$ to the
structure factor $S(q)$ that one determines experimentally. One thereby
obtains

\[ \rho\, \tilde{c}(q) = 1 - \frac{1}{S(q)} \quad (q \neq 0), \tag{4-108b} \]


which gives $\tilde{c}(q)$ directly in terms of $S(q)$. A number of liquid state
features are related to $\tilde{c}(q)$ instead of c(r), which explains partly the
relevance of the OZ equation (in contrast, the radial distribution function
is determined by taking the inverse Fourier transform of the structure
factor, whose oscillatory nature makes such a transformation open to
uncertainties). In particular, the long wavelength limit ($q \to 0$) of
$\tilde{c}(q)$ can be determined reliably from compressibility data (refer to
formula (4-99)).
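As a minimal numerical sketch of relations (4-108a) and (4-108b), the following Python fragment converts sampled values of the structure factor into $\tilde{c}(q)$ and then recovers $\tilde{h}(q)$ from the Fourier-space OZ relation. It assumes that (4-98d) has the standard form $S(q) = 1 + \rho\tilde{h}(q)$ for $q \neq 0$; the function names and the sample values of S(q) are hypothetical and serve only as an illustration, not as measured data.

```python
import numpy as np

def direct_from_structure_factor(S_q, rho):
    """Return c~(q) from S(q), using rho * c~(q) = 1 - 1/S(q)  (eq. 4-108b)."""
    S_q = np.asarray(S_q, dtype=float)
    return (1.0 - 1.0 / S_q) / rho

def total_from_direct(c_q, rho):
    """Return h~(q) by solving h~ = c~ + rho * h~ * c~  (eq. 4-108a) for h~."""
    c_q = np.asarray(c_q, dtype=float)
    return c_q / (1.0 - rho * c_q)

# Hypothetical number density and structure-factor samples (illustration only).
rho = 0.8
S_q = np.array([0.05, 0.2, 1.0, 2.5, 0.9])

c_q = direct_from_structure_factor(S_q, rho)
h_q = total_from_direct(c_q, rho)

# Round trip, ASSUMING S(q) = 1 + rho * h~(q) for q != 0.
S_back = 1.0 + rho * h_q
print(np.allclose(S_back, S_q))
```

Note that a small value of S(q) at long wavelengths (low compressibility) corresponds to a large negative $\rho\tilde{c}(q)$, which is how the direct correlation function encodes the stiffness of a dense liquid.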


There exist a number of approximation schemes by means of which the 
direct correlation function can be estimated theoretically with consider- 
able precision. These differ from the virial expansion in being applicable 
to moderately high densities characterizing the liquid state, and agree 
with the virial expansion at low densities. Generally speaking, the ap- 
proximation schemes are based on closure formulas for the OZ equation 
whereby the latter, involving two unknown functions (h(r) and c(r)), is
converted into an integral equation for one unknown function, c(r). Having
obtained c(r), one can work out h(r) (and hence g(r)) from the OZ equation
itself.


It is important to appreciate the difference between the density expansion 
(i.e., the virial expansion) approach, that can be employed to work out a 
virial expansion for g(r) in terms of cluster integrals as in sec. 4.1, and 
the approximation schemes (a number of these are briefly outlined below), 
mentioned above, based on closure formulae for the OZ equations. Since the 
latter are valid at relatively high densities, these can be invoked to obtain 
higher order virial coefficients in the density expansion formulas. It is in 


this respect that the OZ equation gains relevance. 


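The virial expansion for g(r) looks like (refer to [66], chapter 6)

\[ g(r) = e^{-\beta\phi(r)} \Big[ 1 + \rho \int d\mathbf{r}'\, f(|\mathbf{r}' - \mathbf{r}|)\, f(r') + \cdots \Big] \qquad \big( f(r) \equiv e^{-\beta\phi(r)} - 1 \big), \tag{4-109a} \]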


from which one can derive the virial expansion for c(r) where, in the zeroth
order, one finds

\[ c(r) \approx e^{-\beta\phi(r)} - 1. \tag{4-109b} \]


We express the virial expansion of c(r), which begins with the zeroth order
term $c_0(r) = e^{-\beta\phi(r)} - 1$, in the form

\[ c(r) = c_0(r) + \rho\, c_1(r) + \rho^2 c_2(r) + \cdots, \tag{4-110} \]

where each of the coefficients $c_i(r)$ ($i = 0, 1, 2, \cdots$) can be expressed
in terms of a set of cluster integrals. Two of the more widely employed
approximation schemes for the evaluation of the direct correlation function,
namely, the Percus-Yevick (PY) approximation and the hypernetted-chain (HNC)
approximation, are based on a systematic omission of sets of diagrams
arising in $c_i(r)$ for $i \geq 2$, and are effectively described by means of the
following closure formulas that convert the OZ equation into closed integral
equations:


\[ \text{[PY closure formula:]} \quad c(r) = \big(1 - e^{\beta\phi(r)}\big)\big(h(r) + 1\big), \tag{4-111a} \]

\[ \text{[HNC closure formula:]} \quad c(r) = h(r) - \beta\phi(r) - \ln\big(h(r) + 1\big). \tag{4-111b} \]


Of the two, the PY closure formula discounts a larger set of diagrams
in c(r) but, in the end, turns out to be a better approximation for liquids
for which the repulsive hard core is of overriding relevance. The solution
to the PY equation can be worked out exactly in the case of a hard sphere
interaction and is found to give an excellent description of the behavior
of a number of liquids over varied physical conditions. The equation
of state obtained in this case from the compressibility equation (4-92b)
reads ([23], chapter 2)

\[ \frac{p}{\rho k_B T} = \frac{1 + \eta + \eta^2}{(1 - \eta)^3} \qquad \Big( \eta \equiv \frac{\pi}{6}\, \sigma^3 \rho \Big), \tag{4-112a} \]

while that worked out from the pressure equation (4-79) is

\[ \frac{p}{\rho k_B T} = \frac{1 + \eta + \eta^2 - 3\eta^3}{(1 - \eta)^3}, \tag{4-112b} \]

where $\sigma$ stands for the diameter of the hard sphere core of the pair
potential (the collision diameter). The HNC scheme is, in a sense,
complementary to the PY approximation since it works well in the case of
pair potentials with an attractive tail.
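The difference between the two PY routes can be made concrete with a short numerical sketch. The Python fragment below evaluates the compressibility factor $p/(\rho k_B T)$ from (4-112a) and (4-112b) at a few packing fractions; it assumes that $\eta = \pi\rho\sigma^3/6$ with $\sigma$ the hard sphere diameter, and the function names and sample values of $\eta$ are hypothetical choices made for illustration.

```python
import math

def packing_fraction(rho, sigma):
    """eta = (pi/6) * rho * sigma**3, with sigma ASSUMED to be the sphere diameter."""
    return math.pi * rho * sigma**3 / 6.0

def z_py_compressibility(eta):
    """p/(rho k_B T) from the compressibility route, eq. (4-112a)."""
    return (1.0 + eta + eta**2) / (1.0 - eta)**3

def z_py_pressure(eta):
    """p/(rho k_B T) from the pressure (virial) route, eq. (4-112b)."""
    return (1.0 + eta + eta**2 - 3.0 * eta**3) / (1.0 - eta)**3

# Tabulate both routes at a few illustrative packing fractions.
for eta in (0.01, 0.1, 0.3, 0.45):
    zc = z_py_compressibility(eta)
    zp = z_py_pressure(eta)
    print(f"eta={eta:4.2f}  compressibility route={zc:8.4f}  pressure route={zp:8.4f}")
```

The two expressions agree up to the order $\eta^2$ and differ by $3\eta^3/(1-\eta)^3$, so the discrepancy, which reflects the approximate nature of the PY closure, becomes noticeable only at liquid-like densities.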


The direct correlation function, which can be calculated quite precisely in 
analytical and numerical schemes, is of relevance in the study of the gas- 
liquid critical point where it is made use of in the description of critical 


fluctuations. 


4.5.7 Perturbation theory for liquids 


The hard sphere model (or one where the infinitely sharp repulsive core 
is replaced with a steep but continuously varying one) has been found 
to be remarkably successful in describing the short range structure of 
liquids arising out of molecular correlations where, in the long range, the 


correlations die out. 


The length scale characterizing the repulsive core, i.e., the ‘collision
diameter’ $\sigma$, differs substantially from the length scale over which the
correlations disappear. In a simple-minded hard-sphere model, the pair
potential is assumed to be infinitely large for $r < \sigma$ while, in more
realistic models such as the Lennard-Jones potential, the pair potential is
assumed to increase sharply, but continuously, as $r$ is made to decrease
through $\sigma$. Further, $\sigma$ itself depends on the pressure and temperature,
though to a small extent.


A very fruitful approach to the liquid state is the one of starting off from a
model with a strongly repulsive intermolecular interaction and then taking
into account the effect of the relatively long range attractive potential
as a perturbation. This approach is to be contrasted with the perturbation
approach (the virial expansion) that starts from the ideal gas (zero
correlation) and then introduces the correlations as a perturbation, thereby
arriving at an approximate equation of state for a dilute gas. In the
perturbation theory for liquids, correlations are introduced in a major way
from the very beginning via the model described by the repulsive core.


The attractive ‘tail’, as it is commonly referred to, of the pair potential 
has the effect of providing for a bulk cohesion in a mass of the liquid 
where, in the leading order of the perturbation, each molecule may be 
assumed to feel a uniform confining potential (i.e., one without fluctua- 
tions) which yields, precisely, the van der Waals equation of state. The 
latter, in other words, constitutes an instance of the perturbation theory 
for liquids mentioned above. More sophisticated perturbation theories of 


liquids have also been devised. 


The general framework of the perturbation theory (refer to [101], [73]) of 


liquids may be briefly outlined here. 


The total intermolecular potential energy $\Phi$ is decomposed as

\[ \Phi = \Phi^{(0)} + \Phi', \tag{4-113} \]

where $\Phi^{(0)}$ refers to the model, described by the repulsive interaction
alone, that one starts from and $\Phi'$ to the perturbing potential (defined as
the difference between the actual potential $\Phi$ and the reference potential
$\Phi^{(0)}$).


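The configuration integral (it is understood that we consider an $N$-particle
system) can then be expressed in the form

\[ Q = \int d^N\mathbf{r}\; e^{-\beta\Phi^{(0)}} e^{-\beta\Phi'} = Q^{(0)}\, \frac{\int d^N\mathbf{r}\; e^{-\beta\Phi^{(0)}} e^{-\beta\Phi'}}{\int d^N\mathbf{r}\; e^{-\beta\Phi^{(0)}}} = Q^{(0)}\, \langle e^{-\beta\Phi'} \rangle_0 \tag{4-114} \]

(check this out) where $\langle e^{-\beta\Phi'} \rangle_0$ stands for the average of
$e^{-\beta\Phi'}$ in the unperturbed ensemble. The total free energy $F$ of the
system can then be broken up as

\[ F = F^{(0)} + F', \tag{4-115a} \]

where $F^{(0)} = -\beta^{-1} \ln \big( Q^{(0)} / (N!\, \lambda^{3N}) \big)$ is the free energy of the
unperturbed system, and $F' = -\beta^{-1} \ln \langle e^{-\beta\Phi'} \rangle_0$ represents the effect of
the perturbation. We expand $e^{-\beta F'}$ as

\[ e^{-\beta F'} = \langle e^{-\beta\Phi'} \rangle_0 = \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!}\, \langle (\Phi')^n \rangle_0, \tag{4-115b} \]

which implies a corresponding expansion of $F'$ as

\[ F' = \sum_{n=1}^{\infty} \frac{a_n}{n!}\, (-\beta)^{n-1}, \tag{4-115c} \]

where

\[ a_1 = \langle \Phi' \rangle_0, \quad a_2 = \langle \Phi'^2 \rangle_0 - \langle \Phi' \rangle_0^2, \ \cdots. \tag{4-115d} \]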


In other words, one has the following high temperature expansion (i.e.,
expansion in powers of $\beta$; in reality, the expansion parameter is
$\beta\epsilon$, where $\epsilon$ is a parameter that characterizes the strength of the
attractive interaction) of the free energy of the system:

\[ F = F^{(0)} + \sum_{n=1}^{\infty} \frac{a_n}{n!}\, (-\beta)^{n-1} = F^{(0)} + \langle \Phi' \rangle_0 - \frac{1}{2 k_B T} \big[ \langle \Phi'^2 \rangle_0 - \langle \Phi' \rangle_0^2 \big] + \cdots. \tag{4-116} \]
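The structure of this expansion can be checked numerically on a toy example. In the Python sketch below, a set of Gaussian random numbers stands in for the values taken by the perturbing potential in the unperturbed ensemble (this is purely an assumption made for illustration; no actual liquid is simulated), and the exact perturbation free energy $-\beta^{-1} \ln \langle e^{-\beta\Phi'} \rangle_0$ is compared with the first-order and second-order truncations of the series.

```python
import numpy as np

def free_energy_shift_exact(phi_prime, beta):
    """F' = -(1/beta) * ln <exp(-beta * Phi')>_0, with <.>_0 a sample average."""
    return -np.log(np.mean(np.exp(-beta * phi_prime))) / beta

def free_energy_shift_second_order(phi_prime, beta):
    """First two terms of the cumulant series: a1 - (beta/2) * a2."""
    a1 = np.mean(phi_prime)
    a2 = np.mean(phi_prime**2) - a1**2
    return a1 - 0.5 * beta * a2

# Hypothetical samples of Phi' in the unperturbed ensemble (toy Gaussian model).
rng = np.random.default_rng(0)
phi_prime = rng.normal(loc=-2.0, scale=1.0, size=200_000)
beta = 0.1  # small beta, i.e. high temperature

f_exact = free_energy_shift_exact(phi_prime, beta)
f_first = float(np.mean(phi_prime))
f_second = free_energy_shift_second_order(phi_prime, beta)
print(f_exact, f_first, f_second)
```

At small $\beta$ the second-order truncation lies much closer to the exact value than the first-order one, which is the sense in which the series is a high temperature expansion.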


Once the free energy is calculated up to any given order by working out
the various moments of the perturbing potential $\Phi'$ in the unperturbed
ensemble, the equation of state is obtained to that order from the formula
$p = -\left( \frac{\partial F}{\partial V} \right)_T$. The actual working out of this scheme, however, involves
technical intricacies that lie outside the scope of this introductory
exposition. However, we indicate below how one obtains the van der Waals
equation of state in the first order of the perturbation theory under the
assumption that the reference system is an assembly of hard spheres with a
fixed collision diameter $\sigma$, independent of pressure and temperature, and
then conclude this section by mentioning how the van der Waals result
can be improved upon.


As for the van der Waals equation of state for a liquid, one assumes that
$\Phi'$ can be broken up into a sum of pair potentials of the form

\[ \Phi' = \sum_{i<j} \phi(r_{ij}), \tag{4-117} \]


where $\phi$ represents the attractive part of the total two-particle interaction,
of which the repulsive part has been assumed to have been incorporated
in describing the unperturbed state of the system. The exact form of the
attractive potential is not relevant here. On expressing $\langle \Phi' \rangle_0$ as

\[ \langle \Phi' \rangle_0 = \sum_{i<j} \langle \phi(r_{ij}) \rangle_0, \tag{4-118} \]

we make use of the homogeneity of the system in the unperturbed state
(the homogeneity is maintained in the perturbed state as well) and of the
definition of the radial distribution function, so as to arrive at

\[ \langle \Phi' \rangle_0 = \frac{1}{2}\, N \rho \int d\mathbf{r}\; g^{(0)}(r)\, \phi(r), \tag{4-119a} \]

where $g^{(0)}(r)$ stands for the radial distribution function defined for the
reference state (check the above equation out; you can either make use
of the definition of the two-particle distribution function $\rho^{[2]}$, in terms of
the unperturbed configuration integral, or obtain the result more directly
by recalling the physical interpretation of the radial distribution function;
refer to the paragraph following eq. (4-67b); the factor of $\frac{1}{2}$ comes in as the
restricted sum ($i < j$) is converted to an unrestricted one). Since $\phi(r)$ is
negative for an attractive potential, we can write the above formula as

\[ \langle \Phi' \rangle_0 = -a N \rho, \qquad a = 2\pi \int_0^{\infty} r^2\, g^{(0)}(r)\, |\phi(r)|\, dr, \tag{4-119b} \]

where $a$ is a positive constant.


It now remains to work out the free energy $F^{(0)}$ of the hard sphere system,
described by a pair potential

\[ \phi^{(0)}(r) = \begin{cases} \infty & (r < \sigma) \\ 0 & (r > \sigma), \end{cases} \tag{4-120} \]

where $\sigma$ stands for the collision diameter, i.e., the range of the repulsive
core. The canonical partition function for this potential is given by

\[ Z^{(0)} = \frac{1}{N!\, \lambda^{3N}}\, (V - Nb)^N, \tag{4-121} \]

(check this out) which is simply an expression of the excluded volume
effect due to the hard core repulsion. On working out the unperturbed
free energy $F^{(0)}$ from this and substituting the result, along with that
of (4-119b), in (4-116), one obtains, up to the first order of perturbation,

\[ F = -\beta^{-1} \ln \Big[ \frac{1}{N!\, \lambda^{3N}}\, (V - Nb)^N \Big] - a N \rho. \tag{4-122} \]


The van der Waals equation (4-39b) follows from this (check this out).
You will have noticed that the derivation here is essentially the one
outlined in sec. 4.1.3.3, where the van der Waals equation was derived in the
mean field approximation invoked along with the excluded volume effect,
though the present derivation refers to a different context (that pertaining
to the liquid state).
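The step from the first-order free energy to the equation of state can be verified numerically. The Python sketch below differentiates a free energy of the form (4-122) with respect to the volume by a central finite difference and compares the result with the van der Waals pressure $N k_B T/(V - Nb) - aN^2/V^2$. It assumes $\rho = N/V$, treats $a$ and $b$ as constants, and uses units in which the thermal wavelength equals 1; the parameter values are hypothetical.

```python
import math

def free_energy(V, N, beta, a, b, lam=1.0):
    """F = -(1/beta) ln[(V - N b)^N / (N! lam^(3N))] - a N rho, with rho = N/V."""
    log_z0 = N * math.log(V - N * b) - math.lgamma(N + 1) - 3 * N * math.log(lam)
    return -log_z0 / beta - a * N * (N / V)

def pressure_numeric(V, N, beta, a, b, dV=1e-3):
    """p = -(dF/dV) at fixed T and N, by a central finite difference."""
    f_plus = free_energy(V + dV, N, beta, a, b)
    f_minus = free_energy(V - dV, N, beta, a, b)
    return -(f_plus - f_minus) / (2.0 * dV)

def pressure_van_der_waals(V, N, beta, a, b):
    """Closed-form van der Waals pressure, for comparison."""
    return N / (beta * (V - N * b)) - a * N**2 / V**2

# Hypothetical parameters in reduced units (k_B T = 1/beta).
N, V, beta, a, b = 100, 200.0, 1.0, 1.0, 0.5
p_num = pressure_numeric(V, N, beta, a, b)
p_vdw = pressure_van_der_waals(V, N, beta, a, b)
print(p_num, p_vdw)
```

The terms in the free energy that do not depend on $V$ (the factorial and the thermal wavelength) drop out of the pressure, which is why the equation of state involves only $a$ and $b$.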


Barker and Henderson improved upon the van der Waals result on two
counts: (1) by making use of a more flexible model for the unperturbed
system where the repulsive core of the potential is an effective hard
sphere model with a range that involves a number of adjustable parameters
(this model resembles a number of features arising from the Lennard-Jones
potential), and (2) by going over to the second order of perturbation
where, in turn, an additional approximation was invoked (they considered
two distinct routes for this additional approximation, and obtained
equivalent results from the two). The attractive part of the two-particle
potential was assumed to be of the square well type, characterized by a
range and a strength. The results obtained from their model were seen
to be in excellent agreement with molecular dynamics and Monte Carlo
simulations for liquids over a wide range of physical conditions. Other
derivations of the equation of state in the perturbation theory, with
similar results, also exist in the literature.


Concluding remarks. 


The present section (sec. 4.5) gives only a very brief and incomplete
account of the statistical mechanics of liquids in equilibrium, but it includes
a number of basic ideas necessary for a more complete description. A
major topic left unexplored is the liquid-vapor phase separation and the
analysis of surface phenomena. Though a true phase transition cannot
be explained in terms of the theoretical approaches to the liquid state
outlined in the above paragraphs, the possibility of two distinct phases
is contained in the formula (4-79) which indicates two distinct values of
the density $\rho$ for given values of $p$, $T$ (ignoring for the moment the
density dependence of the radial distribution function g). The liquid-vapor
coexistence curve can be obtained from the condition of the equality of the
chemical potentials for the two phases, but the equilibrium separation
between two phases and a surface of separation can be established only
in the presence of an external field that breaks the homogeneity of a
liquid or vapor mass. The entire theory of the liquid state can be worked
out by including an external field, when the various quantities of interest
appear as functionals of the density distribution $\rho(\mathbf{r})$. This more general
and powerful approach to the liquid state is outlined in [64], [95], while a
more direct and physical approach is to be found in [23], [73].


The equilibrium configuration of a gas or a liquid appears as a limiting case 
of more general non-equilibrium configurations where, generally speaking, 
a non-equilibrium configuration passes over asymptotically to an equilib- 
rium one in the course of time evolution. The latter involves various time 
scales characterizing distinct regimes of evolution. We will, in later chap- 
ters in this book, look at the problem of justification of the basic principles 
of equilibrium statistical mechanics in the general context of time evolu- 
tion of large systems (chapters 8, 9, 10), and at simplified evolution equa- 
tions describing various transport processes that reveal a general tendency 
of evolution towards the relevant equilibrium configurations. These general 
non-equilibrium features essentially require a mutual interaction, however 
weak, among the constituent particles making up the system under consid- 
eration. In other words, interacting systems constitute the real content of 


statistical mechanics. 


With this brief look at the dilute gas and the monatomic liquid in the
present chapter, we will resume the study of interacting systems in
chapters 6 and 7, where we consider discrete lattice-based systems (for
which a number of exact results are available), and interacting bosons and
fermions, but before that we devote the next chapter (chapter 5) to a
consideration of a number of general features of interacting systems
relevant to the foundations of statistical mechanics. In this sense,
chapter 5, as also chapters 9 and 10 of the present book, are foundational
ones.




Chapter 5 


The Thermodynamic Limit 


5.1 Thermodynamic limit: introduction 


In writing this chapter I draw, to a large extent, from [100], and [121]. 
McCoy’s is a great book on statistical mechanics and is of exceptional 
value for its exhaustive coverage, especially for its crisp summariza- 
tion of a great many results of basic relevance. Ruelle’s book is one 
of the few major works that includes a systematic presentation of 
rigorous results in statistical mechanics, many of them Ruelle’s own 
contribution to the subject. I skip the proofs of most results that I 
state below since these are beyond the scope of this book. I hope this 
will not conflict with my avowed job of building up an overall picture 
for you, the issues discussed in this chapter being foundational ones 


in statistical mechanics. 


As mentioned in the previous chapters, statistical mechanics makes sense
in the thermodynamic limit, in which

\[ V \to \infty, \quad N \to \infty, \quad \frac{N}{V} \to \rho \ \ (\text{finite}), \tag{5-1} \]

where the notation is by now familiar and pertains to the microcanonical
and the canonical ensembles while, in the case of the grand canonical
ensemble, one has to replace $N$ with $\bar{N}$, the mean number of particles. It
is in this limit that the interactions between the microscopic constituents
of an assembly of a large number of identical particles (or particles
belonging to a number of distinct species) make it behave as a single
macroscopic system in a way that can be described in terms of a relatively
small number of thermodynamic variables.


Incidentally, along with the mutual interaction among the particles of the 
system under consideration, one is to consider their interactions with the 
particles of the containing vessel as well, where this latter interaction is 
often represented in terms of a number of boundary conditions (these are 
also relevant in describing the mode of exchange of matter and energy 
with external systems). What is more, the interaction of the constituents 
with one or more external fields may also be relevant in the thermody- 
namic behavior. In our present considerations, we will not consider the 
external fields, and will refer to the interactions with the containing vessel 


in terms of appropriate boundary conditions. 


A special case of the thermodynamic behavior corresponds to thermodynamic
equilibrium, which is our principal concern in these first few
chapters of the present book.

It is of great interest to note that the relations between the thermodynamic
variables are, to a large extent, independent of the actual interaction
potential characterizing the system of particles. Indeed, the thermodynamic
behavior is even independent, to a large extent, of whether
the particles are described by classical or quantum principles at the
microscopic level. However, at the same time, one cannot have just any
arbitrary interaction potential among the particles for thermodynamic
behavior to be possible. A necessary condition for thermodynamic behavior
is that the potential energy of interaction ($\Phi$) between the constituents
should be able to hold the system together as a single macroscopic entity
without an arbitrarily large number of particles collapsing to a finite and
small volume or, on the other hand, dispersing to large distances. In the
latter case, the interaction of the dispersed particles with the walls of the
containing vessel will have a dominant role, however large the vessel may
be, and the behavior of the system will no longer be independent of the
boundary conditions.


In sections 5.2.1 and 5.2.2 below, I will outline the conditions necessary
to prevent the dispersal of the particles to large separations and the
collapse of the system to a small volume. These are referred to, in
general, as the temperedness and the stability conditions respectively. Having
stated the two sets of conditions (one stating restrictions on $\Phi^{[N]}$ at large
separations between the particles, and the other at small separations) I
will briefly outline, in section 5.3, a number of results relating to the
existence of the thermodynamic limit. The existence results are intimately
connected with those relating to the equivalence of the various equilibrium
ensembles of statistical mechanics, to be outlined in section 5.4 below.
It is the equivalence of ensembles that allows us to describe a system
in equilibrium in terms of any of the various possible ensembles of
statistical mechanics (e.g., the microcanonical, canonical, or grand canonical
ensemble; other ensembles are also possible) as convenient. From the
point of view of thermodynamics, this corresponds to the equivalence of
various schemes of description in terms of alternative sets of thermodynamic
variables related to one another by Legendre transformations.


The thermodynamic limit is of essential relevance in establishing the pos- 
sibility of phase transitions in a system. This will be briefly reviewed in 


sec. 5.7. 


In the following, $\Phi^{[N]}(\mathbf{r}^{[N]})$ will denote the interaction energy of a system
made up of $N$ particles located at position vectors $\mathbf{r}^{[N]} = \{\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N\}$.
We will mostly be concerned, for the sake of simplicity, with the case
when $\Phi^{[N]}$ reduces to a sum of pair potentials

\[ \Phi^{[N]} = \sum_{1 \leq i < j \leq N} \phi(\mathbf{r}_i - \mathbf{r}_j), \tag{5-2} \]

though the more general case when $\Phi^{[N]}$ is the resultant of $k$-particle
interaction energies $\phi^{(k)}$ ($k = 2, 3, \cdots, N$) can also be considered ($\phi^{(k)}$ for
$k = 1$ can be set at zero without loss of generality, though the value of the
chemical potential of the system depends on this choice).

$\Phi^{[N]}$ and $\phi^{(k)}$ will be assumed to be invariant under translations, by any
given fixed vector distance, of the position vectors appearing as arguments
of these functions and, moreover, $\phi$ will be assumed to satisfy the
requirement $\phi(\mathbf{r}) = \phi(-\mathbf{r})$ in virtue of invariance under an interchange of
the particles. In the case of interactions satisfying the requirement of
rotational invariance (this is the case of the interaction between the atoms
of an inert gas), $\phi(\mathbf{r})$ depends only on $r = |\mathbf{r}|$.


The temperedness and stability conditions appear primarily as restrictions
on $\Phi^{[N]}$ for all possible (and large) values of $N$. In the case when
the interaction can be described in terms of a pair potential $\phi$, these
imply certain requirements to be satisfied by the latter. However, the
requirements to be satisfied by $\Phi^{[N]}$ cannot always be translated into a set
of necessary and sufficient conditions on $\phi$, though sufficient conditions
can be stated, in terms of which one arrives at a reasonably thorough
understanding of what the thermodynamic limit signifies.


5.2 The temperedness and stability conditions 


5.2.1 The temperedness condition 


The temperedness condition is necessary for establishing the existence of
the thermodynamic limit since it guarantees that the particles making up the
system under consideration do not disperse to large distances.


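A potential energy function $\Phi^{[N]}$, defined classically, is said to be of the
tempered type if, for arbitrarily chosen $N_1, N_2 > 0$ with $N_1 + N_2 = N$, there
exist constants $A \geq 0$, $R_0 > 0$ such that, for some $\epsilon > 0$, one has

\[ \begin{aligned} W_{N_1,N_2} &\equiv \Phi^{[N]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_{N_1}, \mathbf{r}'_1, \cdots, \mathbf{r}'_{N_2}) - \Phi^{[N_1]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_{N_1}) - \Phi^{[N_2]}(\mathbf{r}'_1, \cdots, \mathbf{r}'_{N_2}) \\ &\leq \frac{A N_1 N_2}{R^{3+\epsilon}} \qquad (\text{when } |\mathbf{r}_i - \mathbf{r}'_j| \geq R \geq R_0 \ \text{ for } 1 \leq i \leq N_1,\ 1 \leq j \leq N_2), \end{aligned} \tag{5-3} \]

where this condition is to hold for arbitrarily chosen $N$ (only sufficiently
large values of $N$ are relevant). In the case when $\Phi^{[N]}$ can be expressed as
a sum of pair potentials, the above condition is satisfied if and only if, for
some $A \geq 0$, $R_0 > 0$, and for some $\epsilon > 0$,

\[ \phi(\mathbf{r}) \leq \frac{A}{r^{3+\epsilon}} \qquad (\text{for } r \geq R_0). \tag{5-4} \]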


This essentially means that the positive part of the pair interaction between
particles becomes vanishingly weak at large separations, without the
negative part being restricted (a similar statement holds for $W_{N_1,N_2}$ defined
in (5-3)). The latter, however, is restricted by the stability condition to be
stated in sec. 5.2.2.1.


The potential energy $\Phi^{[N]}$ is said to be strongly tempered if one has, for
some $R_0 > 0$,

\[ W_{N_1,N_2} \leq 0 \qquad (\text{when } |\mathbf{r}_i - \mathbf{r}'_j| \geq R_0 \ \text{ for } 1 \leq i \leq N_1,\ 1 \leq j \leq N_2), \tag{5-5} \]

where the notation is as in (5-3) (with $A = 0$). In the case when $\Phi^{[N]}$
reduces to a sum of pair potentials, this is ensured if

\[ \phi(\mathbf{r}) \leq 0 \qquad (\text{for } r \geq R_0). \tag{5-6} \]

This condition is satisfied by a large number of intermolecular potentials
of interest.
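The two pair-potential conditions (5-4) and (5-6) can be explored numerically. The Python sketch below tests the strong temperedness condition on a finite grid for the Lennard-Jones potential, taken here in its standard 12-6 form (an assumption, since the form is not written out in this section), and shows that for a repulsive Coulomb-like potential $1/r$ the scaled quantity $r^{3+\epsilon}\phi(r)$ keeps growing with distance, so that no finite constant $A$ can serve in (5-4). The function names, grid sizes and parameter values are hypothetical, and a grid check of this kind is only indicative, not a proof.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones form (ASSUMED): 4 eps [(s/r)^12 - (s/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6**2 - sr6)

def coulomb_repulsion(r):
    """Repulsive Coulomb-like pair potential in reduced units: 1/r."""
    return 1.0 / r

def is_strongly_tempered_on_grid(phi, r0, r_max=50.0, n=20000):
    """Check phi(r) <= 0 for r0 <= r <= r_max on a finite grid, cf. (5-6)."""
    r = np.linspace(r0, r_max, n)
    return bool(np.all(phi(r) <= 0.0))

def scaled_supremum(phi, r0, r_max, eps=0.1, n=20000):
    """Largest value of r^(3+eps) * phi(r) on the grid; (5-4) needs this bounded."""
    r = np.linspace(r0, r_max, n)
    return float(np.max(r ** (3.0 + eps) * phi(r)))

print(is_strongly_tempered_on_grid(lennard_jones, r0=1.0))   # R_0 = sigma
print(is_strongly_tempered_on_grid(lennard_jones, r0=0.9))   # inside the core
print(scaled_supremum(coulomb_repulsion, 1.0, 10.0),
      scaled_supremum(coulomb_repulsion, 1.0, 100.0))
```

For the Lennard-Jones form the choice $R_0 = \sigma$ works because the potential changes sign exactly at $r = \sigma$ and stays negative beyond it.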


The temperedness conditions (5-3) and (5-4) are sufficient in the quantum
context as well, provided the potential energy operator $\hat{\Phi}$ in the
Hamiltonian (1-11) corresponds to the classical potential $\Phi$ satisfying these
conditions. In other words, the classical conditions are sufficient to ensure
temperedness in the quantum case.


It may be noted, however, that the proof of the existence of the
thermodynamic limit does not require the temperedness condition (5-3) as a
necessary one, though the latter is a sufficient condition (along with the 
stability condition(s) to be mentioned below) under which the existence 
proof can be conveniently carried through. One particular case of great 
relevance is that of a system of point particles with Coulomb interactions. 
The repulsive Coulomb interaction between like charges violates (5-4) at 
large distances and, since Coulomb interactions are ubiquitous in na- 
ture, this raises serious questions as to how one can really know that the 


particles constituting a system do not fly apart from one another. 


The answer to this lies in the phenomenon of screening and presupposes
an electrically neutral system of charges. Keeping in mind that the theory
should describe the thermodynamic behavior of ordinary bulk matter, we
consider a system made up of $N$ negatively charged particles
(electrons) and $K$ positively charged particles (nuclei), each of
the latter bearing a charge $Z$ (a multiple of the electronic charge), where
all the particles interact by means of the two-body Coulomb potential, and
where electrical neutrality requires $N = ZK$. The physics underlying the
thermodynamic behavior of such a system in the limit $N, K \to \infty$, $V \to \infty$
is that, on a macroscopic scale, it can be described as a system of
microscopic aggregates, made up of oppositely charged particles, that
effectively behave as neutral particles (whose internal charge distributions
are screened with respect to external points), the interaction potential
between which satisfies (5-4). The thermodynamic limit for Coulomb systems
is discussed in detail in [92] (see also [91]), where one observes the
following: assuming that stability against implosion due to the divergence
of the attractive Coulomb potential is accounted for (by making use of
quantum principles, see below), the stability against Coulomb repulsion
at large distances (which falls off more slowly than what is required for
temperedness) can be ensured in classical terms under the condition of
overall charge neutrality.


While the interaction potential can be taken to be spherically sym- 
metric in the case of monatomic matter, there arises a direction de- 
pendence in the case of interactions between molecules, these being, 


generally speaking, dipolar in nature. 


However, quantum theory is operative in preventing the short-range
collapse of the neutral aggregates of oppositely charged particles referred to
above, because these aggregates are nothing but the atoms and molecules
constituting bulk matter. In the short range, the attractive Coulomb
interaction between oppositely charged particles threatens to violate the
stability of matter (see sec. 5.2.2 below), and the quantum uncertainty
principle, coupled with the effective repulsive interaction between fermions,
ensures stability. This will be briefly outlined in sec. 5.2.2.2 below.


5.2.2 Stability conditions 
5.2.2.1 Stability in the classical theory 


In the classical theory, the condition for thermodynamic stability on the
potential energy $\Phi^{[N]}(r^{[N]})$ (for arbitrarily specified $N$),
ensuring that the system under consideration does not collapse to an
infinitesimally small volume, reads


$\Phi^{[N]}(r^{[N]}) \geq -BN \quad (B \geq 0)$,    (5-7)


for some non-negative constant $B$ and for all possible choices of
$r^{[N]} = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$. In this
case the canonical partition function $Z_c$ satisfies the upper bound


(5-8) 


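As a corollary, the grand canonical partition function $Z_g$, which is
built up as a sum over $N$ of terms involving the canonical partition
functions $Z_c$ (see the series (2-96)), is also bounded above (check this
out).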
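

The "check this out" above can be sketched as follows. This is only an illustrative derivation under assumed notation, not necessarily the normalization used elsewhere in this book: $\lambda$ (thermal de Broglie wavelength), $\beta = 1/(k_B T)$ and $z$ (fugacity) are symbols introduced here for the purpose of the sketch. The only input is the stability condition (5-7), which gives $e^{-\beta \Phi^{[N]}} \leq e^{\beta B N}$ for every configuration.

```latex
% Sketch under assumed notation: \lambda, \beta and z are assumptions of this
% illustration; only the stability condition (5-7) is taken from the text.
\begin{align}
Z_c &= \frac{1}{N!\,\lambda^{3N}} \int_{V^N} d^3r_1 \cdots d^3r_N \;
       e^{-\beta \Phi^{[N]}(r^{[N]})}
     \;\leq\; \frac{1}{N!} \left( \frac{V}{\lambda^{3}} \right)^{N} e^{\beta B N}, \\
Z_g &= \sum_{N=0}^{\infty} z^{N} Z_c
     \;\leq\; \sum_{N=0}^{\infty} \frac{1}{N!}
       \left( \frac{z V e^{\beta B}}{\lambda^{3}} \right)^{N}
     \;=\; \exp\!\left( \frac{z V e^{\beta B}}{\lambda^{3}} \right) \;<\; \infty .
\end{align}
```

In this form the role of the linear bound is visible: a lower bound on the potential energy growing faster than $N$ in magnitude would not be tamed by the factor $1/N!$ in the same simple way.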
Refer to [46] (especially, sec. 4.1) for an explanation of the relevance
of the condition (5-7). Gallavotti's book is renowned for its meticulous
attention to a great many issues often overlooked in textbooks
of statistical mechanics, and bears the stamp of his authority in the
field.


If the potential energy function can be expressed as a sum of pair
potentials, $\Phi^{[N]} = \sum_{i<j} \phi(\mathbf{r}_i - \mathbf{r}_j)$, with
$\phi(\mathbf{r}) = \phi(-\mathbf{r})$, then the above requirement imposes
restrictions on $\phi(\mathbf{r})$. A sufficient condition on
$\phi(\mathbf{r})$ for (5-7) to be satisfied is that $\phi(\mathbf{r})$ is a
sum of a positive part ($\phi_1(\mathbf{r})$), possibly discontinuous, and
another continuous part ($\phi_2(\mathbf{r})$) that has a positive Fourier
transform (which is real under the condition
$\phi(\mathbf{r}) = \phi(-\mathbf{r})$):


$\phi(\mathbf{r}) = \phi_1(\mathbf{r}) + \phi_2(\mathbf{r})$,    (5-9a)


$\tilde{\phi}_2(\mathbf{k}) = \int d\mathbf{r}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \phi_2(\mathbf{r}) \geq 0$,    (5-9b)


where, moreover, $\tilde{\phi}_2(\mathbf{k})$ is to be integrable. A pair
potential whose Fourier transform satisfies (5-9b) is said to be of the
positive type. In the decomposition (5-9a), $\phi_1(\mathbf{r})$ is
repulsive, while $\phi_2(\mathbf{r})$ includes all of the attractive part.


The above result tells us that a pair potential that can be represented
as a sum of a positive potential and a potential of the positive type is
necessarily stable.
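

A small numerical check may help make the statement concrete. The sketch below uses a hypothetical pair potential, $\phi_2(r) = (3 - r^2)\,e^{-r^2/2}$, chosen here only because its three-dimensional Fourier transform is proportional to $k^2 e^{-k^2/2} \geq 0$, so that it is of the positive type even though it is attractive for $r > \sqrt{3}$. For such a potential the double sum over all $i, j$ is non-negative, which gives the stability constant $B = \phi_2(0)/2$. The function names, the box size and the sampling scheme are assumptions of this illustration.

```python
import math
import random


def phi2(r2):
    """Hypothetical positive-type pair potential, as a function of r squared.

    phi2 = (3 - r^2) * exp(-r^2 / 2); its 3D Fourier transform is
    proportional to k^2 * exp(-k^2 / 2), which is non-negative.
    """
    return (3.0 - r2) * math.exp(-0.5 * r2)


def pair_energy(points):
    """Sum of phi2 over all distinct pairs i < j of the given 3D points."""
    total = 0.0
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            r2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += phi2(r2)
    return total


def min_energy_per_particle(n, box, trials, seed=0):
    """Smallest energy per particle found among random configurations in a cube."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        pts = [(rng.uniform(0.0, box), rng.uniform(0.0, box), rng.uniform(0.0, box))
               for _ in range(n)]
        best = min(best, pair_energy(pts) / n)
    return best


# Stability constant implied by positivity of the full double sum:
# sum_{i<j} phi2 = (sum_{i,j} phi2 - n * phi2(0)) / 2 >= -n * phi2(0) / 2.
B = 0.5 * phi2(0.0)
```

Random sampling cannot prove the bound, of course; it only illustrates that no sampled configuration has an energy per particle below $-B$, while individual pairs at separations larger than $\sqrt{3}$ do contribute negative energies.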


An instance of substantial relevance is a pair potential $\phi(r)$ of the
Lennard-Jones type, defined as follows: (a) $\phi$ is bounded below, (b)
there exist positive constants $a_1$, $a_2$ ($a_1 < a_2$) and $\epsilon$
such that


$\phi(r) \geq \frac{1}{r^{3+\epsilon}}$ (for $r < a_1$),
$\phi(r) \geq -\frac{1}{r^{3+\epsilon}}$ (for $r > a_2$).    (5-10)


What is more, such a pair potential is superstable (see below).


A necessary and sufficient condition for stability in the case of an
interaction described by a pair potential $\phi(\mathbf{r})$ (with
$\phi(\mathbf{r}) = \phi(-\mathbf{r})$) is


$\sum_{i=1}^{n} \sum_{j=1}^{n} \phi(\mathbf{r}_i - \mathbf{r}_j) \geq 0$,    (5-11)


for all positive integers $n$ and for all possible choices of
$\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n\}$, where the pair
potential is to be an upper semi-continuous one at all points, i.e.,
satisfies the condition that, for any arbitrarily chosen $\mathbf{r}$ and
$a > \phi(\mathbf{r})$, there exists a neighborhood $B$ of $\mathbf{r}$ such
that $\phi(\mathbf{r}') < a$ when $\mathbf{r}'$ belongs to $B$.


The above condition ensures that the series (2-96) representing the grand 
partition function converges, so that the latter leads to thermodynamic 


behavior in an unambiguous manner. 


Pair potentials that do not suffice to ensure stability are referred to as 


catastrophic ones. 
Superstability 


A number of considerations in statistical mechanics, including some of
those relating to phase transitions, are based on the assumption of
superstability, which is stronger than the stability condition (5-7), and
requires that there exist positive constants $A$, $B$ such that


$\Phi^{[N]}(r^{[N]}) \geq N\left(-B + A\,\frac{N}{V}\right)$,    (5-12a)


for arbitrarily chosen $r^{[N]}$ within some sufficiently large volume $V$.
In the case when $\Phi^{[N]}$ reduces to a sum of pair potentials, this is
ensured if $\phi(r)$ is a sum of the form $\phi(r) = \phi_1(r) + \phi_2(r)$
where $\phi_1$ is stable and $\phi_2$ is non-negative, continuous, and
satisfies $\phi_2(0) > 0$.


In particular, if $\phi_2(r) = \frac{C}{r^{3+\epsilon}}$ with $C > 0$,
$\epsilon > 0$, then $\Phi^{[N]}$ satisfies a stronger bound of the form


$\Phi^{[N]}(r^{[N]}) \geq N\left(-B + A\left(\frac{N}{V}\right)^{1+\epsilon/3}\right)$.    (5-12b)


This implies that a potential of the Lennard-Jones type (refer to (5-10)) is
superstable, satisfying (5-12b). On the other hand, the ideal gas
($\phi = 0$) does not satisfy the condition of superstability (recall that
the ideal gas does not undergo a phase transition). One observes that
superstability is satisfied if the potential energy per particle grows with
the density at least linearly.


One needs a generalization of the stability conditions outlined above when 
considering, instead of a system of identical particles, one involving a 
finite number of different types or species of particles. The generalization, 
however, is straightforward (see [100], sec. 3.1.4) and, for the sake of 


brevity, will not be stated here explicitly. 
While the conditions (5-4) and (5-10), including their multi-species
generalizations (we consider the case of pair interactions for the sake of
concreteness; as mentioned above, the multi-species conditions have not
been stated explicitly), are sufficient to ensure stability in the classical
theory, they are not necessary ones (an exception is the condition (5-11),
which specifies a necessary and sufficient condition), i.e., there may exist
systems that are stable even without obeying these conditions.


A system of particular interest is one made up of a finite number of 
species of charged particles interacting through the Coulomb potential. 
In the case of point charges, the system is unstable in the classical the- 
ory (however, stability may be restored if the particles are assumed to 
obey quantum principles, see sec. 5.2.2.2 below), but is stable, if the pair 
potential happens to have a hard core, in which case the system need not 


be electrically neutral. 


5.2.2.2 Stability in the quantum theory 


In the case of a quantum mechanical system, the consideration of the
thermodynamic limit involves problems of a technical nature, since the
appropriate definition of the Hamiltonian (refer to (1-11) of section
1.3.3.1) in the limit of an infinite volume raises a number of basic issues.
Without going into these questions (refer to [121], section 3.5), we assume
that a Hamiltonian $H^{[N]}$ can be defined in a state space made up of wave
functions that are either symmetric or antisymmetric when the particles
making up the system under consideration are bosons or fermions (in
the case of a system made up of both bosonic and fermionic particles, the
wave functions are to satisfy the appropriate symmetry conditions under
permutations of only bosons of any specified type or of only fermions of
any specified type).


The basic stability condition for the system, which is analogous to the
classical stability condition (5-7), then states that the minimum
eigenvalue ($E_0$) of $H^{[N]}$, referred to as the ground state energy, has
to satisfy


$E_0 \geq -BN \quad (B \geq 0)$,    (5-13a)


for some non-negative constant $B$ and for all $N$, sufficiently large.
Indeed, the classical condition implies (5-13a), since the expectation value
of the quantum mechanical kinetic energy operator is necessarily positive.
In turn, the above quantum mechanical condition implies that the minimum
eigenvalue for the system under consideration, when confined within a
given volume $V$, sufficiently large, satisfies the same inequality


$E_0(V) \geq -BN \quad (B \geq 0)$,    (5-13b)


where the definition of the Hamiltonian for the system bounded within $V$
needs the specification of some appropriate boundary condition.


As mentioned in 5.2.2.1, the classical stability condition, though
sufficient for stability in the quantum context, is not necessary. In other
words, a system may be unstable when considered in terms of the classical
theory but, for the same potential function $\Phi$, may be stable when
considered in the more general quantum theory. This, in particular, is
the case of a system of point particles with mutual interactions described
by the Coulomb potential. The first major result in this area was produced
by Dyson and Lenard in a work ([89]) that is generally considered
a tour de force in quantum physics. We consider a system made up of,
say, $N$ fermions possessing charge of a fixed sign (say, negative), along
with $K$ positively charged particles. The negatively charged fermions are
assumed to belong to $q$ types or groups ($q \geq 1$), so that any possible
wave function of the system has to satisfy the antisymmetry requirement
under the interchange of co-ordinates of any two particles belonging to any
of these $q$ groups. The remaining set of positively charged particles can
include fermions or bosons, or both. The masses of the negatively charged
particles are assumed to have an upper bound, say, $m$, while the magnitudes
of charges of all the particles in the system are assumed to have an upper
bound $e$. All the particles are assumed to interact by Coulomb forces.


One then obtains the bound for the ground state energy


$E_0 \geq -q^{2/3} A N$,    (5-14)


as the condition for stability, where $A > 0$ is a constant whose estimated
value turned out to be rather large, giving a bound differing widely from
the actual bound to the ground state energy. However, what is of far
greater relevance is that Dyson and Lenard's work established that the
antisymmetry requirement on the fermionic wave functions (implying the
Pauli exclusion principle and an effectively repulsive interaction among
fermions) can override the effects of the Coulomb attraction between
oppositely charged particles that makes the pair interaction energy go to
$-\infty$ at vanishingly small distances.


The role of the antisymmetry requirement becomes apparent when one
considers a system made up of bosons alone or, more generally, ignores
the quantum mechanical symmetry requirements on the wave functions
of a system of particles, in which case one obtains, with positive constants
$B$, $C$,


$-BN^{5/3} \leq E_0 \leq -CN^{7/5}$.    (5-15)


In other words, the ground state energy (as also the free energy at any
non-zero temperature) of the system goes to $-\infty$ faster than the first
power of the number of particles.
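

Why a faster-than-linear decrease of the ground state energy is incompatible with ordinary thermodynamic behavior can be illustrated with a toy calculation. The sketch below treats a power law $E_0(N) = -cN^{p}$ as if it were the actual ground state energy (an assumption made purely for illustration, with hypothetical function names and $c = 1$) and compares the linear case $p = 1$ with the exponent $p = 7/5$ known for the bosonic upper bound.

```python
def ground_state_model(n, exponent, c=1.0):
    """Toy model E0(N) = -c * N**exponent (illustrative, not a derived result)."""
    return -c * n ** exponent


def energy_released_on_merging(n, exponent, c=1.0):
    """Energy released when two N-particle systems merge into one 2N-particle system.

    Computed as 2 * E0(N) - E0(2N); it vanishes for a strictly extensive
    energy (exponent 1) and grows with N for any exponent larger than 1.
    """
    return 2.0 * ground_state_model(n, exponent, c) - ground_state_model(2 * n, exponent, c)


def energy_per_particle(n, exponent, c=1.0):
    """E0(N) / N in the toy model; bounded only when the exponent is at most 1."""
    return ground_state_model(n, exponent, c) / n
```

With the exponent $7/5$ the energy released on bringing two large samples together grows as $N^{7/5}$, so that the energy per particle has no limit as $N \to \infty$; with the exponent $1$ nothing is released and the energy per particle is constant.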


The Dyson-Lenard result was improved upon by a number of authors, 
including Lieb and Thirring (refer to [91]; Lieb’s review is remarkable for 
its lucidity in view of the major results it presents on the stability of 
Coulomb systems), who produced a greatly lowered estimate for the con- 
stant A of (5-14). In particular, one has the following important result 


referred to as the stability of matter. 
Stability of matter. 


Consider a system made up of $N$ electrons (number of spin states $q = 2$;
the derivation works for more general values of $q$) and $K$ nuclei of
charges $z_i$ in units of electronic charge, where the nuclear charges are
assumed to be bounded above as $z_i \leq z$ ($i = 1, 2, \ldots, K$), and
where the system need not be electrically neutral. Then,


(stability of matter:)  $E_0 \geq -A\left(N + \sum_{i=1}^{K} z_i^{7/3}\right)$,    (5-16)


where the constant $A$ ($\propto q^{2/3}$) was evaluated by Lieb and
Thirring and was found to lead to a substantially lower estimate for the
magnitude of the lower bound to $E_0$ (~ 23 Ry per particle in the case of
an electrically neutral system with nuclear charges $z_i = 1$
($i = 1, 2, \ldots, K$); this compares very favorably with the binding
energy of an isolated hydrogen atom) as compared to the Dyson-Lenard value
(~ $10^{14}$ Ry per particle). Since the $z_i$s have been assumed to be
bounded above, the above result implies that the lower bound to the ground
state energy is proportional to the total number of particles ($N + K$).


5.3 Existence of the thermodynamic limit 


The microcanonical ensemble is, in a sense, the basic equilibrium
ensemble of statistical mechanics. The issue of the existence of the
thermodynamic limit for the microcanonical ensemble can be addressed in
the classical as also in the quantum context. In either case the proof of
the existence of the thermodynamic limit is a technically involved one,
and is not within the scope of the present book. Instead, I will present in
this section a summary of certain basic results that lead up to the
existence proof. In section 5.3.2 I follow [121] in outlining (while
omitting the details of derivation) how the thermodynamic limit is arrived
at for a classical system of interacting particles described in terms of
continuously varying positions and momenta. Section 5.3.3 is a brief
introduction to Lanford's approach to the issue of the thermodynamic limit
while, finally, section 5.3.4 introduces, again in brief outline only,
Griffiths' seminal contribution to the theory of the thermodynamic limit,
especially in the quantum context.


The literature addressing the existence of the thermodynamic limit for the 
microcanonical ensemble is a vast one, dating from the nineteen fifties 
and continuing to the late nineteen seventies, by which time a number 
of definitive results of basic relevance had appeared (results continue 
to appear to this day). I hope that the present section, with its three 
subsections mentioned above, will give you an idea as to what is involved 


in addressing this fundamental issue in statistical mechanics. 


5.3.1 The Van Hove and Fisher limits 


Consider a fluid comprised of $N$ constituent particles confined
to a volume $V$ in the absence of external fields, where $N$ and $V$ are
sufficiently large. Empirical observations tell us that for a given value of
the density $\rho = N/V$ the behavior of the fluid does not depend on the
size of the system and, moreover, is independent of a number of other
parameters on which it could possibly depend. For instance, the shape of the
boundary surface of the fluid, which actually needs an infinite number of
parameters for its exact specification, has no role in determining its
behavior which, likewise, is independent of the interaction between the
fluid particles and the walls of the containing vessel (refer to earlier
discussion in 1.2.5 and 2.1.3), i.e., of the boundary conditions on the
system under consideration.


From the theoretical point of view, all these observations can be justified
only in the limit $N \to \infty$, $V \to \infty$ ($N/V \to \rho$, finite),
and that too when the volume goes to infinity not in any arbitrary manner
but subject to certain specific requirements that render the additional
shape parameters and boundary conditions ineffective in the determination of
the behavior of the system as a whole. In other words, the system behavior
depends on $\rho$ and one other parameter, the entropy (the default system
under consideration is, for the sake of simplicity, a simple fluid), where
the latter is defined as (refer to (2-58) and the alternative definition of
the microcanonical ensemble in 2.2.1.3)


$S = \lim k_B \ln W$,    (5-17a)
with 


W= — [ ae dIrdIp, (5-17b) 
and where the notation is familiar by now, differing a little from that
in sec. 2.2.1; in particular, $\rho$ will stand for the particle density in
the thermodynamic limit, and not the phase space distribution function. The
limit in (5-17a) refers to the thermodynamic limit defined above, and will
turn out to be independent of whether the microcanonical ensemble is
defined as in 2.2.1.1 or in 2.2.1.3.


The manner in which the volume $V$ is to be made to go to infinity is
referred to as the Fisher limit. An alternative manner of approach to the
thermodynamic limit is referred to as the Van Hove limit. These are
described below.


The Van Hove limit. 


Consider a rectangular region $R$ in space with non-zero edge lengths
$a = \{a_1, a_2, a_3\}$ parallel to the co-ordinate axes of a Cartesian
system, with one corner located at the origin and with the region contained
entirely within the positive octant (fig. 5-1(A)), for the sake of
concreteness. If, now, we make $a_1, a_2, a_3$ go to infinity, the volume
($V(a) = a_1 a_2 a_3$) of the rectangular region under consideration will be
said to go to infinity in the sense of Van Hove.


More generally, consider a region $A$ with volume $V(A)$. Let the translate
of the rectangular region defined above, by distances
$n_1 a_1, n_2 a_2, n_3 a_3$ parallel to the axes (with $n_1, n_2, n_3$
integers) be denoted by $R_n$. Imagining all possible values of the integers
$n_1, n_2, n_3$, let $N^+$ denote the number of such translated regions that
have a non-zero overlap with the region $A$, and $N^-$ the number of
similarly translated regions contained entirely within $A$. If now $A$ is
made large in such a manner that


$\lim \frac{N^-}{N^+} = 1$,    (5-18)


regardless of the values of $a_1, a_2, a_3$, then $V(A)$ will be said to go
to infinity in the sense of Van Hove.
The Van Hove limit can be defined in an alternative way. Fig. 5-1(B)
depicts a region $A$, of volume $V(A)$, and a 'surface layer' $A_h$, of
fixed depth $h$, whose volume is denoted by $V_h(A)$. If, now, the region
$A$ is made to grow in size in an unbounded manner such that


$\lim \frac{V_h(A)}{V(A)} = 0$,    (5-19)


regardless of the value of $h$, then one can say that $V(A) \to \infty$ in
the sense of Van Hove.


The Fisher limit. 


The Fisher limit is defined by imposing a stronger condition on the
manner in which a region $A$ is to grow in size. Let $D_A$ be the diameter
of $A$, i.e., the maximum distance between a pair of points chosen
arbitrarily within it (refer to fig. 5-2). Consider a 'shape function'
$\pi(\alpha)$ such that $\lim_{\alpha \to 0} \pi(\alpha) = 0$. If, now, $A$
grows in size in such a manner that, for sufficiently small $\alpha$,


$\frac{V_{\alpha D_A}(A)}{V(A)} \leq \pi(\alpha)$,    (5-20)


at every stage of growth of the size of $A$, then $V(A)$ will be said to go
to infinity in the sense of Fisher. Here $V_{\alpha D_A}(A)$ refers to the
volume of a surface layer of depth $\alpha D_A$ within $A$.


Figure 5-1: Depicting the Van Hove limit; (A) $R$ is a rectangular region
with edge lengths $a_1, a_2, a_3$, the edges being parallel to the axes of a
rectangular co-ordinate system; in the figure on the right, $R_n$ is the
translate of $R$, obtained by translations $n_1 a_1, n_2 a_2, n_3 a_3$
parallel to the axes; $A$ is a given region that is made large in such a
manner that (5-18) is satisfied (with $N^-$, $N^+$ defined as in the text);
$V(A)$ is then said to go to infinity in the sense of Van Hove; the
translated region $R_n$ shown is to be counted in $N^+$; (B) an alternative
description of the Van Hove limit; $A_h$ is a surface layer of depth $h$
within the region $A$; the Van Hove limit then refers to the condition
(5-19), which is to be satisfied regardless of the value of $h$.


Figure 5-2: Depicting a region $A$, and a surface layer of depth
$\alpha D_A$, where $\alpha$ is a small parameter and $D_A$ is the diameter
of $A$; if the volume of this surface layer ($V_h(A)$, with
$h = \alpha D_A$) remains small (less than or equal to $V(A)$ times some
specified 'shape' function $\pi(\alpha)$ for sufficiently small $\alpha$) as
the region $A$ is made to grow in size, then one says that the volume
becomes infinite in the sense of Fisher.


For instance, consider a rectangular region $A_t$ of edge lengths
$t, t, t^2$, for any $t > 0$. If, now, we consider a growth in size of $A_t$
with $t \to \infty$, then $V(A_t)$ goes to infinity in the sense of Van
Hove, but not in the sense of Fisher because $A_t$ becomes 'needle-shaped'
in the above limit, with two of the ratios of its edge lengths going to
zero. In other words, the Fisher limit requires the volume of a region to
grow equably in all directions, while the Van Hove limit allows some degree
of 'disproportionate' growth of volume.
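

The difference between the two limits can be checked on rectangular boxes, for which the volume of a surface layer is elementary. The sketch below is an illustration only: the function names are hypothetical, the needle-shaped box is taken to have edge lengths $t, t, t^2$, and the surface layer of depth $h$ is taken to be the set of points of the box lying within a distance $h$ of its boundary.

```python
import math


def surface_layer_fraction(edges, depth):
    """Fraction of a rectangular box's volume lying within `depth` of its boundary."""
    inner = 1.0
    for a in edges:
        # Each edge of the inner box is shortened by 2 * depth (never below zero).
        inner *= max(a - 2.0 * depth, 0.0) / a
    return 1.0 - inner


def diameter(edges):
    """Largest distance between two points of the box (its space diagonal)."""
    return math.sqrt(sum(a * a for a in edges))


def needle(t):
    """Box with edge lengths t, t, t^2, needle-shaped for large t."""
    return (t, t, t * t)


def cube(t):
    """Box growing equably in all directions."""
    return (t, t, t)


def van_hove_fraction(edges, h):
    """Surface-layer fraction for a fixed depth h (the Van Hove criterion)."""
    return surface_layer_fraction(edges, h)


def fisher_fraction(edges, alpha):
    """Surface-layer fraction for depth alpha * diameter (the Fisher criterion)."""
    return surface_layer_fraction(edges, alpha * diameter(edges))
```

For the needle the fixed-depth fraction goes to zero as $t$ grows, while the layer of depth $\alpha D_A$ eventually swallows the whole box for any fixed $\alpha > 0$, so that no shape function vanishing at $\alpha \to 0$ can bound it. For the cube the Fisher fraction is independent of $t$ and small for small $\alpha$.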


5.3.2 Thermodynamic limit for the classical microcanonical ensemble


We first consider, for a region of finite volume $V$, the configurational
microcanonical partition function defined (refer to [121]; we use a slightly
changed notation compared to [121]) as


$W(V,N,E) = \frac{1}{N!} \int_{\Phi^{[N]}(r^{[N]}) \leq E} d\Gamma(r)$,    (5-21)


where $d\Gamma(r) = d^3r_1\, d^3r_2 \cdots d^3r_N$, and each of the
integration variables $\mathbf{r}_i$ ($i = 1, 2, \ldots, N$) ranges through
the volume $V$. The configurational entropy is defined as


$S(V,N,E) = k_B \ln W(V,N,E)$.    (5-22)


This, evidently, is an increasing function of $V$ and $E$ for any given $N$,
and can be inverted to give $E$ as an increasing function of $S$ (reason
this out):


$E = E(V,N,S)$.    (5-23)


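Further, for fixed $N$ (and $S$), $E$ is a decreasing function of $V$, since
$\left(\frac{\partial E}{\partial V}\right)_S = -\left(\frac{\partial S}{\partial V}\right)_E \Big/ \left(\frac{\partial S}{\partial E}\right)_V$,
where both the derivatives on the right hand side are non-negative.


Note further that the stability condition (5-7) implies that the function
$E(V,N,S)$ is defined only when


$E \geq -NB$.    (5-24)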


One now makes use of the temperedness condition (5-3) to arrive at the
result that, given two disjoint volumes $V_1$, $V_2$ with a minimum
separation $r \geq R_0$ between them, the following inequality holds (as
mentioned previously, I state without proof only a few of the intermediate
steps of derivation leading up to the establishment of the thermodynamic
limit),


$E(V_1 + V_2, N_1 + N_2, S_1 + S_2) \leq E(V_1, N_1, S_1) + E(V_2, N_2, S_2) + \frac{A N_1 N_2}{r^{3+\epsilon}}$.    (5-25)


It is this inequality that ensures that the interaction energy between two
large disjoint chunks of the system will have no effect on its
thermodynamic variables like the internal energy. Considering $m$ ($> 2$)
disjoint volumes $V_1, V_2, \ldots, V_m$, one can generalize the above
inequality to


$E\left(\sum_{i=1}^{m} V_i, \sum_{i=1}^{m} N_i, \sum_{i=1}^{m} S_i\right) \leq \sum_{i=1}^{m} E(V_i, N_i, S_i) + \frac{A}{2}\, \frac{\left(\sum_{i=1}^{m} N_i\right)^2}{r^{3+\epsilon}}$.    (5-26)


The next important step towards the establishment of the thermodynamic
limit consists of attaining the limit through a sequence of cubes. In the
$K$th step of the process one constructs a cube (in which smaller cubes
are included, see below) of edge length $L_K$ given by


$L_K = 2^K L - \theta^K R$,    (5-27a)


where $\theta$ and $R$ are defined by the inequalities


$2^{3/(3+\epsilon)} < \theta < 2, \qquad R \geq \frac{R_0}{2 - \theta}$,    (5-27b)


with the constants $R_0$, $\epsilon$ as in sec. 5.2.1. In the next step of
the sequence, eight translates of the cube are placed within a larger cube
of edge length $L_{K+1}$ with mutual separations at least as large as $R_K$,


$R_K = L_{K+1} - 2L_K = \theta^K (2 - \theta) R \geq R_0$,    (5-28)


as shown in fig. 5-3.
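

The bookkeeping of the cube construction can be checked with a few lines of arithmetic. The sketch below assumes the recursion in the form $L_K = 2^K L - \theta^K R$, so that the corridor width is $R_K = L_{K+1} - 2L_K = \theta^K (2 - \theta) R$, and it looks at the sum $\sum_m 8^{m+1} R_m^{-(3+\epsilon)}$ that controls the accumulated interaction energy between the cubes. The numerical values of $L$, $R$, $\theta$ and $\epsilon$ used with it are arbitrary choices made for illustration, and the function names are hypothetical.

```python
def edge_length(k, L, R, theta):
    """Edge length L_K = 2^K * L - theta^K * R of the cube at step K."""
    return 2.0 ** k * L - theta ** k * R


def corridor(k, L, R, theta):
    """Separation R_K = L_{K+1} - 2 * L_K between the eight sub-cubes at step K."""
    return edge_length(k + 1, L, R, theta) - 2.0 * edge_length(k, L, R, theta)


def correction_sum(terms, R, theta, eps):
    """Partial sum of 8^(m+1) * R_m^-(3+eps) over m = 0 .. terms-1.

    With R_m = theta^m * (2 - theta) * R each term equals
    first * ratio^m, where ratio = 8 / theta^(3+eps); the series is
    geometric and converges exactly when theta > 2^(3/(3+eps)).
    """
    power = 3.0 + eps
    first = 8.0 * ((2.0 - theta) * R) ** (-power)
    ratio = 8.0 / theta ** power
    total = 0.0
    for m in range(terms):
        total += first * ratio ** m
    return total
```

For instance, with $\epsilon = 1$ the lower limit on $\theta$ is $2^{3/4} \approx 1.68$: the choice $\theta = 1.8$ gives a convergent sum, while $\theta = 1.5$ gives partial sums that grow without bound, which is why the corridors have to widen sufficiently fast from one step to the next.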


Figure 5-3: Depicting the disposition of eight cubes of edge length $L_K$
(as in (5-27a)) in a larger cube of edge length $L_{K+1}$, with mutual
separations $R_K$ (see (5-28)); a planar section depicting four of the eight
cubes is shown.


Then, by a successive application of (5-26) and of the earlier observation
that, for fixed $N$, $S$, $E$ is a decreasing function of $V$, one obtains
the inequality


$E(V_{K+1}, 8N, 8S) \leq 8E(V_K, N, S) + \frac{A}{2}\,(8N)^2 R_K^{-(3+\epsilon)}$.    (5-29)


Note how the arguments of $E$ get changed from one step of the sequence
to the next: while $V_K = L_K^3$ goes over to $V_{K+1} = L_{K+1}^3$, $N$ and
$S$ become eightfold. We then introduce variables $\delta$, $\sigma$ (these
will in the end represent (up to a normalization) the particle density
($\rho$) and the configurational entropy density ($s = S/V$)) such that
$8^K \delta$ is a positive integer, and define


$c_K(\delta, \sigma) = 8^{-K} E(V_K, 8^K \delta, 8^K \sigma) - \frac{A}{2}\,\delta^2 \sum_{m=0}^{K-1} 8^{m+1} R_m^{-(3+\epsilon)}$,    (5-30)


which, by (5-29), is seen to satisfy


$c_{K+1}(\delta, \sigma) \leq c_K(\delta, \sigma)$.    (5-31)


Note that the requirement that $8^K \delta$ be a positive integer ensures
that $8^{K+1} \delta$ is also a positive integer; since the step $K$ of the
sequence of building larger cubes from smaller ones has been chosen
arbitrarily, $\delta$ is to be defined more generally as a rational number
of the form $2^{-q} p$, with $p$, $q$ integers; one can then extend $\delta$
to a continuously varying real variable.
This shows that the limit


$\lim_{K \to \infty} c_K(\delta, \sigma) = c(\delta, \sigma)$    (5-32)


exists (the stability condition (5-24) implies a lower bound to the limit).
Noting that the infinite sum $\sum_{m=0}^{\infty} 8^{m+1} R_m^{-(3+\epsilon)}$
is convergent (by (5-27b), (5-28)), one finally concludes that the limit
$\lim_{K \to \infty} 8^{-K} E(V_K, 8^K \delta, 8^K \sigma)$ exists, which we
define to be $\eta(\delta, \sigma)$ (to be identified, up to a
normalization, with the interaction energy density $\varepsilon = E/V$ in
the thermodynamic limit, which we have now arrived at for the specific
sequence of cubes under consideration):


$\eta(\delta, \sigma) = \lim_{K \to \infty} 8^{-K} E(V_K, 8^K \delta, 8^K \sigma) = c(\delta, \sigma) + \frac{A}{2}\,\delta^2 \sum_{m=0}^{\infty} 8^{m+1} R_m^{-(3+\epsilon)}$.    (5-33)


By making use of the temperedness condition (5-25) (which, actually,
leads to the extensivity property) and the result that, for fixed $N$, $S$,
the interaction energy $E$ is a decreasing function of $V$, one further
concludes that $\eta$ is a convex function of $\delta$, $\sigma$ where,
moreover, it is an increasing function of $\sigma$ for fixed $\delta$ (this
follows from the observation that $E(V,N,S)$ defined in (5-23) is an
increasing function of $S$).


Once this basic result is arrived at, one proceeds through a series of steps
of mathematical reasoning as in [121] to extend the result to the case of
a general sequence of increasing volumes, where the volume tends to
infinity in the sense of Fisher, and where one finds that


thermodynamic limit:  $\frac{N}{V} \to \frac{\delta}{L^3}, \ \frac{S}{V} \to \frac{\sigma}{L^3}$  (then)  $\lim_{V \to \infty} \frac{E(V,N,S)}{V} = \frac{\eta}{L^3}$,    (5-34)


i.e., the limit $\varepsilon = \frac{\eta}{L^3}$ exists, where
$\eta(\delta, \sigma)$ has been introduced above. What is more, the function
$\eta(\delta, \sigma)$ is defined in some convex domain $A$ in the
$\delta$-$\sigma$ plane, and is a convex function in this domain, i.e., for
$0 < \alpha < 1$,


$\eta(\alpha \delta_1 + (1 - \alpha) \delta_2, \alpha \sigma_1 + (1 - \alpha) \sigma_2) \leq \alpha\, \eta(\delta_1, \sigma_1) + (1 - \alpha)\, \eta(\delta_2, \sigma_2)$,    (5-35)


where $(\delta_i, \sigma_i)$ ($i = 1, 2$) belong to the domain $A$. Finally,
$\eta$ is an increasing function of $\sigma$ (for any given $\delta$) within
$A$.


The domain $A$ is depicted schematically in figure 5-4. Outside this
domain, $\eta \to \infty$, i.e., the thermodynamic limit does not exist.
With reference to points within the domain, one can define, in the three
dimensional space made up of co-ordinates $\delta, \sigma, \eta$, the
surface $S$ corresponding to the graph of $\eta$, i.e., the set of points
$(\delta, \sigma, \eta(\delta, \sigma))$.


Figure 5-4: The region $A$ in the $\delta$-$\sigma$ plane (schematic) in
which the function $\eta$, i.e., the thermodynamic limit (up to a
normalization) of the interaction energy density $\varepsilon = E/V$,
exists; $A$ is an open convex domain in which $\eta$ is a convex function,
increasing in $\sigma$ for any fixed $\delta$ while, for points outside the
domain, $\eta \to \infty$; the particle density ($\rho$) and the
configurational entropy density ($s$) in the thermodynamic limit are
related to $\delta$, $\sigma$ as $\rho = \delta/L^3$, $s = \sigma/L^3$,
where $L$ is a length scale introduced to define the thermodynamic limit
(refer to (5-27b)).


Instead of considering the function $E(V,N,S)$, one can equally well
develop the theory of the thermodynamic limit for the configurational
entropy $S(V,N,E)$, in an analogous manner, and the result can be inferred
from (5-34), by making use of the fact that $\eta$ is an increasing
function of $\sigma$.


We introduce the variables $\varepsilon = \eta/L^3$, $\rho = \delta/L^3$,
$s = \sigma/L^3$ and, referring to the bounded surface $S$ defined by the
thermodynamic limit, we denote it by the symbol $\Sigma$ when considered in
the space of variables $\rho, s, \varepsilon$. Corresponding to the region
$A$ in the $\delta$-$\sigma$ plane, one now has a convex region $\Theta$ in
the $\rho$-$\varepsilon$ plane (figure 5-5), which is the projection of the
surface $\Sigma$ on this plane, the latter being the graph of a function
$s(\rho, \varepsilon)$ whose existence is thus a consequence of the
thermodynamic limit established above:


If the volume $V$ of a region containing $N$ particles, whose potential
energy $\Phi$ satisfies the temperedness and stability conditions, goes
to infinity in the sense of Fisher such that $\lim N/V = \rho$,
$\lim E/V = \varepsilon$, then $\lim \frac{1}{V} S(V,N,E)$ exists when
$(\rho, \varepsilon)$ belongs to the region $\Theta$ in the
$\rho$-$\varepsilon$ plane, and is given by the function
$s(\rho, \varepsilon)$ such that $(\rho, \varepsilon, s(\rho, \varepsilon))$
belongs to the surface $\Sigma$ introduced above. The configurational
entropy density $s$ is a concave continuous function of $\rho, \varepsilon$
in $\Theta$, increasing in $\varepsilon$ for any given $\rho$.


The stability condition (5-7) implies that, for points belonging to
$\Sigma$, the interaction energy density $\varepsilon$ satisfies the bound


$\varepsilon \geq -\rho B$,    (5-36a)


while the configurational entropy density satisfies the bound


$s \leq k_B(\rho - \rho \ln \rho)$,    (5-36b)


in virtue of the inequality $S(V,N,E) \leq k_B \ln \frac{V^N}{N!}$ that
follows from the definitions (5-21), (5-22) (reason this out).
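

The passage from the inequality on $S(V,N,E)$ to the bound on the entropy density rests on Stirling's formula: since the configurational integral cannot exceed $V^N$, one has $\ln W \leq N \ln V - \ln N!$, and $\frac{1}{V}(N \ln V - \ln N!)$ tends to $\rho - \rho \ln \rho$ at fixed $\rho = N/V$. A minimal numerical check, with $k_B$ set to unity and with hypothetical function names, could look as follows.

```python
import math


def entropy_density_bound(n, rho):
    """(1/V) * ln(V^N / N!) for N = n particles at density rho (k_B = 1)."""
    volume = n / rho
    # math.lgamma(n + 1) equals ln(n!) and avoids overflow for large n.
    return (n * math.log(volume) - math.lgamma(n + 1)) / volume


def limiting_bound(rho):
    """Thermodynamic-limit value rho - rho * ln(rho) of the same quantity."""
    return rho - rho * math.log(rho)
```

The finite-size value lies slightly below the limiting one, the difference being of the order of $\rho \ln N / N$, so that the bound (5-36b) is approached from below as the system grows.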


Further, for points belonging to $\Sigma$ (recall that $\Sigma$ is the set
of points $(\rho, \varepsilon, s(\rho, \varepsilon))$ such that for given
values of $\rho, \varepsilon$, the thermodynamic limit exists), $\rho$
satisfies the bound


$\rho \leq \rho_0$,    (5-36c)


where $\rho_0 = \infty$ provided the interaction is not characterized by a
hard core (the fluid can be compressed to an infinitely large density)
while, in the case of an interaction with a hard core, $\rho_0$ is finite,
corresponding to the close packing density.


Though the above bounds are satisfied for points on $\Sigma$, the boundary
of the region $\Theta$ is made up, as shown in fig. 5-5, by the half-line
$\rho = 0, \varepsilon > 0$ and the curve
$\varepsilon = \varepsilon_0(\rho)$ where, for any given $\rho < \rho_0$,
$\varepsilon_0$ is the minimum value attained by $\varepsilon$ for $s$
ranging over possible values in $\Sigma$. For points lying outside the
boundary of $\Theta$, the thermodynamic limit does not exist, and one has
$s = -\infty$.


Figure 5-5: The region $\Theta$ in the $\rho$-$\varepsilon$ plane
(schematic) in which the function $s(\rho, \varepsilon)$, i.e., the
thermodynamic limit (up to a normalization) of the configurational entropy
density $S/V$, exists; $s$ is a concave continuous function in $\Theta$,
increasing in $\varepsilon$ for fixed $\rho$; the boundary of $\Theta$ is
made up of the half-line $\rho = 0, \varepsilon > 0$, and the curve
$\varepsilon_0(\rho)$, which lies above the line $\varepsilon = -\rho B$;
$\rho_0$ is the upper bound on $\rho$; a point
$(\rho, \varepsilon_0(\rho))$ on the boundary for a given $\rho > 0$ is
shown.


For any given $\rho$ ($0 < \rho < \rho_0$) belonging to $\Sigma$, the
configurational entropy density $s$ increases with the interaction energy
density $\varepsilon$, starting from $s \to -\infty$ for
$\varepsilon = \varepsilon_0(\rho)$ till it reaches the maximum attainable
value $\bar{s}$ (say), the minimum value of $\varepsilon$ at which this
maximum value of $s$ is attained being, say, $\varepsilon_1(\rho)$ (for
$\varepsilon > \varepsilon_1$, $s$ may remain constant). Thus, $s$ is a
strictly increasing function of $\varepsilon$ in the interval
($\varepsilon_0(\rho) < \varepsilon < \varepsilon_1(\rho)$) ($\bar{s}$ is
finite in virtue of the bound (5-36b)).


As a corollary, one finds that the thermodynamic limit for the
configurational entropy defined by (5-22) established above remains the same
even when the latter is defined in terms of the integral


$W(V,N;\delta E) = \frac{1}{N!} \int_{E - \delta E < \Phi^{[N]}(r^{[N]}) \leq E} d\Gamma(r)$,    (5-37a)


instead of (5-21), with the relative width of the energy window going to
zero (note that $W(V,N;\delta E)$ is not obtained from $W(V,N,E)$ by simply
replacing $E$ with $\delta E$). Indeed, defining $W(V,N,E - \delta E)$
by (5-21), on the stipulation that $E$ is replaced with $E - \delta E$, one
finds


$W(V,N,E) = W(V,N,E - \delta E) + W(V,N;\delta E)$.    (5-37b)


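The thermodynamic limit established above tells us that, for
$V \to \infty$ (with some fixed value of $N/V = \rho$),


$W(V,N,E) \sim e^{Vs/k_B}, \qquad W(V,N,E - \delta E) \sim e^{Vs'/k_B}$,    (5-37c)


where one has $s > s'$, in virtue of the fact that $s$ is a strictly
increasing function of $\varepsilon$ (for the fixed value of $\rho$) in the
interval ($\varepsilon_0(\rho) < \varepsilon < \varepsilon_1(\rho)$). This
implies that the thermodynamic limit exists for
$\frac{k_B}{V} \ln W(V,N;\delta E)$ as well and is, precisely,
$s(\rho, \varepsilon)$ itself:


$s(\rho, \varepsilon) = \lim_{V \to \infty,\ N/V \to \rho,\ E/V \to \varepsilon} \frac{k_B}{V} \ln \frac{1}{N!} \int_{\Phi \leq E} d\Gamma(r)$

$= \lim_{V \to \infty,\ N/V \to \rho,\ E/V \to \varepsilon} \frac{k_B}{V} \ln \frac{1}{N!} \int_{E - \delta E < \Phi \leq E} d\Gamma(r)$.    (5-37d)


The equivalence between the two ensembles holds, for any specified
$\rho < \rho_0$, in the interval
$\varepsilon_0(\rho) < \varepsilon < \varepsilon_1(\rho)$ of the interaction
energy density; for $\varepsilon > \varepsilon_1(\rho)$, the equivalence
between the two configurational ensembles breaks down.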


This, however, is not relevant in the context of the microcanonical 
ensemble considered below where the thermodynamic limit exists for 
arbitrarily large values of the internl energy (U), as distinct from the 


configurational interaction energy (£). 


Finally, we address the question of the existence of the thermodynamic limit in the classical microcanonical ensemble, where the kinetic energy of the particles constituting the system under consideration is taken into account in addition to their interaction energy. Following the definition of the microcanonical ensemble outlined in sec. 2.2.1.3, we write the partition function as

$$ Z_m(V,N,U) = \frac{1}{N!\,h^{3N}}\int_{\sum_i \frac{p_i^2}{2m} + \Phi^{[N]} \le U} dp^{[N]}\, dr^{[N]}, \tag{5-38} $$

where, in the thermodynamic limit, $U/V$ goes over to the internal energy density. Since $\Phi$ does not depend on the momenta $p_i$ ($i = 1,2,\dots,N$), one can choose a variable $\chi$ representing the value of $\sum_{i=1}^{N}\frac{p_i^2}{2m}$, evaluate the integral over the momenta subject to the constraint that the sum of the squared momenta lies within a small range around $2m\chi$ (corresponding to a narrow range $\delta\chi$ of $\chi$; the interaction energy then reduces to $U - \chi$), and then integrate over $\chi$. We define the kinetic part of the partition function as

$$ W_{kin}(N,\chi)\,\delta\chi = \frac{1}{h^{3N}}\int_{\chi < \sum_i \frac{p_i^2}{2m} < \chi+\delta\chi} dp^{[N]}. \tag{5-39} $$


For large $N$, one can evaluate the integral on the right hand side by referring to (3-17) and obtain

$$ W_{kin}(N,\chi) = \frac{(2\pi m)^{3N/2}\,\chi^{\frac{3N}{2}-1}}{h^{3N}\,\Gamma\!\left(\frac{3N}{2}\right)} \tag{5-40} $$

(check this out). As indicated above, the total microcanonical partition function then appears as

$$ Z_m = \int d\chi\, W_{kin}(N,\chi)\, W(V,N,U-\chi), \tag{5-41} $$


where $W$ is defined as in (5-21). We now consider the limit $V \to \infty$ in the sense of Fisher, such that $N/V \to \rho$, $U/V \to u$. The kinetic part of the entropy density for internal energy $U$ is obtained in this limit as

$$ s_{kin}(\rho,u) = k_B \lim \frac{1}{V}\ln W_{kin} = \frac{3}{2}k_B\,\rho\left(\ln\frac{4\pi m u}{3h^2\rho} + 1\right). \tag{5-42} $$
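The limit in (5-42) can be checked numerically. The following is a minimal sketch, assuming units in which $m = h = k_B = 1$ and assuming that $W_{kin}$ is the exact surface measure of the $3N$-dimensional momentum sphere; the function names `ln_w_kin` and `s_kin_over_kb` are illustrative and do not come from the text.

```python
import math

def ln_w_kin(n, chi, m=1.0, h=1.0):
    """ln W_kin(N, chi): log of the shell density of a 3N-dimensional momentum sphere."""
    d = 3 * n
    return ((d / 2) * math.log(2 * math.pi * m)
            + (d / 2 - 1) * math.log(chi)
            - d * math.log(h)
            - math.lgamma(d / 2))

def s_kin_over_kb(rho, u, m=1.0, h=1.0):
    """Right-hand side of (5-42), divided by k_B."""
    return 1.5 * rho * (math.log(4 * math.pi * m * u / (3 * h * h * rho)) + 1)

rho, u = 0.5, 2.0  # illustrative density and kinetic energy density
for volume in (10, 100, 1000, 10000):
    n = int(rho * volume)   # N = rho * V
    chi = u * volume        # chi = u * V
    print(volume, ln_w_kin(n, chi) / volume, s_kin_over_kb(rho, u))
```

The per-volume logarithm approaches the closed form with a correction of order $(\ln V)/V$, which is why the two columns agree only for the larger volumes.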
The integrand in (5-41) then appears, for sufficiently large $V$, as

$$ W_{kin}(N,\chi)\,W(V,N,U-\chi) \propto \exp\left(V k_B^{-1}\left(s_{kin}(\rho,u') + s(\rho,u-u')\right)\right), \tag{5-43} $$

where $u' = \chi/V$. In the thermodynamic limit, it turns out that the leading contribution to the integral (5-41) comes from a small interval around a value of $\chi$ for which

$$ s_{tot}(\rho,u;u') = s_{kin}(\rho,u') + s(\rho,u-u'), \tag{5-44} $$


considered as a function of $u'$, is a maximum. In other words, the total internal energy $U = uV$ is distributed into a kinetic part ($u'V$) and a potential part ($(u-u')V$) such that the total entropy $V s_{tot}$ is a maximum (the possible range of variation of $u'$ for any given value of $\rho$ is $0 < u' < u - \epsilon_0(\rho)$, where $\epsilon_0(\rho)$ is the minimum attainable potential energy introduced above). It then turns out that the following result holds: the classical microcanonical entropy density

$$ s_m(\rho,u) = \lim \frac{k_B}{V}\ln Z_m(V,N,U) \qquad \left(\rho = \lim\frac{N}{V},\ u = \lim\frac{U}{V}\right) \tag{5-45} $$

exists in the thermodynamic limit, and is an increasing function of $u$ for any given $\rho$, where $\rho$, $u$ satisfy $0 < \rho < \rho_0$, $u > \epsilon_0$. The entropy, moreover, is a concave function of $\rho$, $u$.
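The statement that the integral (5-41) is dominated by the maximum of $s_{tot}$ is an instance of Laplace's method, and can be illustrated with a toy model. The sketch below assumes $k_B = m = h = 1$ and uses a purely hypothetical configurational entropy density `s_conf` (concave, increasing, diverging to $-\infty$ at $\epsilon_0$); only `s_kin` follows (5-42). For this toy choice the maximizing split is $u' = \frac{3}{4}(u - \epsilon_0)$.

```python
import math

RHO = 0.5   # number density (illustrative value)
E0 = -1.0   # minimum potential energy density eps_0(rho) (illustrative value)

def s_kin(u_kin):
    """Kinetic entropy density of (5-42) with k_B = m = h = 1."""
    return 1.5 * RHO * (math.log(4 * math.pi * u_kin / (3 * RHO)) + 1)

def s_conf(eps):
    """Hypothetical configurational entropy density (toy model, not from the text)."""
    return 0.5 * RHO * math.log(eps - E0)

def s_tot(u, u_kin):
    """Total entropy density (5-44) for a given kinetic share u_kin."""
    return s_kin(u_kin) + s_conf(u - u_kin)

def grid(u, steps):
    """Midpoints of a uniform grid on the allowed range 0 < u' < u - E0."""
    du = (u - E0) / steps
    return du, [(i + 0.5) * du for i in range(steps)]

def best_split(u, steps=20000):
    """Grid point u' at which s_tot is largest."""
    _, points = grid(u, steps)
    return max(points, key=lambda x: s_tot(u, x))

def log_integral_per_volume(u, volume, steps=20000):
    """(1/V) ln of the integral of exp(V * s_tot) over u', via log-sum-exp."""
    du, points = grid(u, steps)
    exponents = [volume * s_tot(u, x) for x in points]
    top = max(exponents)
    total = sum(math.exp(e - top) for e in exponents)
    return (top + math.log(total * du)) / volume

u = 1.0
print(best_split(u), 0.75 * (u - E0))
for volume in (10.0, 100.0, 1000.0):
    print(volume, log_integral_per_volume(u, volume), s_tot(u, 0.75 * (u - E0)))
```

As the volume grows, the per-volume logarithm of the integral approaches the maximum of $s_{tot}$, the difference decaying like $(\ln V)/V$.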


In addressing the existence of the thermodynamic limit for the classical microcanonical ensemble we have considered here the phase space probability distribution defined in sec. 2.2.1.3, while the definition stated in sec. 2.2.1.1 is an effectively equivalent one. In other words, for a large system, with the number of particles (and the volume) going to infinity, the logarithmic measure of the phase space corresponding to internal energy lying in the range 0 to $U$ is, in the leading order of approximation, the same as that for the internal energy lying between $U$ and $U + \delta U$ ($\delta U \ll U$).
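For an ideal gas this equivalence can be seen directly, since the phase space volume with energy up to $U$ scales as $U^{3N/2}$. The sketch below is an illustration under that assumption only; it takes the thin shell just below $U$ (relative width $\delta U/U$), and the name `log_shell_fraction` is hypothetical.

```python
import math

def log_shell_fraction(n, rel_width):
    """ln of [Omega(U) - Omega(U - dU)] / Omega(U), assuming Omega(U) ~ U**(3N/2)."""
    power = 1.5 * n
    # (1 - rel_width)**power is computed via log1p/expm1 to keep precision
    return math.log(-math.expm1(power * math.log1p(-rel_width)))

for n in (10, 1000, 100000, 10000000):
    # difference between the shell and full logarithmic measures, per particle
    print(n, log_shell_fraction(n, 1e-3) / n)
```

The difference per particle between the two logarithmic measures tends to zero as $N$ grows, even for a fixed small relative width of the shell.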


The results of the above considerations leading to the existence of the 
thermodynamic limit for the microcanonical ensemble hold for a large 
class of boundary conditions satisfied by the system, including the one 
corresponding to an impenetrable boundary of the volume V at each stage 
of the limiting process established through a sequence of successively 


increasing volumes. 


McCoy (in ref. [100], sec. 3.3) outlines an analogous proof for the exis- 
tence of the thermodynamic limit for a system described in terms of the 


canonical ensemble. 


The thermodynamic limit turns out to be independent of the boundary con- 
dition for a system away from a phase transition. If the physical condi- 
tions on the system correspond to the occurrence of a phase transition, the 
uniqueness of the thermodynamic state is compromised, the equivalence 
of the microcanonical and the canonical ensembles in the thermodynamic 
limit is violated, and the boundary conditions become relevant in determining the thermodynamic state of the system under consideration. For further elaboration see sections 5.4, 5.7, and 6.2.4.


5.3.3 Thermodynamic limit: Lanford’s approach


In an influential paper, Lanford ([87]) provided clarification on a large number of specific aspects of the thermodynamic limit. In this section I summarize without derivation the basic idea he started from.

Lanford considered a special class of ‘observables’ for large systems (characterized by the parameters $N$ and $V$, both of which are made to go to infinity), namely ones for which he could establish the thermodynamic limit without going into unnecessary complications. The observables were functions defined in the configuration space, and the potential energy $\Phi$ belonged to this special class. For any given $N$, the number of particles, the $3N$-dimensional configuration space is made up of components of the co-ordinate vectors $r_1, r_2, \dots, r_N$, and a typical observable $f$ was chosen to have the following characteristic features:
1. (Continuity:) $f(r_1, r_2, \dots, r_N)$ is continuous in the particle co-ordinates.

2. (Symmetry:) The function $f$ is invariant under arbitrary permutations of the particle co-ordinates $r_1, r_2, \dots, r_N$.

3. (Translation invariance:) $f$ is invariant under translation of each of the vectors $r_i$ ($i = 1,2,\dots,N$) by one and the same arbitrarily chosen vector $a$.

4. (Normalization:) For any specified $r$, $f(r) = 0$. This ensures that, for any given $N > 1$, the observable $f$ depends on the co-ordinates of all the $N$ particles in a non-trivial way.

5. (Finite range:) There exists a real number $R$ such that, for any chosen integers $J$, $K$, and for a system made up of $N = J + K$ particles,

$$ f(r_1,\dots,r_J, r'_1,\dots,r'_K) = f(r_1,\dots,r_J) + f(r'_1,\dots,r'_K) \quad \text{whenever } |r_j - r'_k| > R \text{ for all } j\ (1\le j\le J),\ k\ (1\le k\le K). \tag{5-46} $$

This requirement, however, is one of convenience and can be relaxed to one where the above equality is satisfied in the leading order of approximation when the separation between the two groups of position vectors $\{r_1, r_2, \dots, r_J\}$ and $\{r'_1, \dots, r'_K\}$ is made to be sufficiently large. Making the assumption of finite range, the smallest $R$ for which (5-46) holds is termed the range of the observable under consideration.


As an example of a finite range observable, consider a system for which the interaction energy $\Phi^{[N]}$ for each and every specified value of $N$ can be expressed in terms of a pair potential $\phi$ (as in (5-2)), where $\phi(r)$ vanishes for $r > R$ for some finite $R$. The interaction $\Phi$ then is an observable of finite range. More generally, an interaction can be of finite range even when it cannot be expressed in terms of a pair potential.


Lanford considered the configurational microcanonical ensemble describing an equilibrium state of the system under consideration corresponding to the interaction energy $\Phi$ lying within a small open interval such that the energy per particle $\Phi^{[N]}/N$ lies within a specified small range $I_\epsilon$ around some chosen value $\epsilon$, for all $N$:

$$ \frac{\Phi^{[N]}}{N} \in I_\epsilon, \tag{5-47a} $$

and examined the probability that the value per particle of an observable $f$, of the type described above, lies within some given interval, say, $J$,

$$ \frac{f}{N} \in J, \tag{5-47b} $$

when the microstate of the system satisfies (5-47a), in the limit $V \to \infty$ (in the sense of Van Hove), $N \to \infty$, $N/V \to \rho$. For any given $V$, $N$, this probability is given by the ratio of the volume measures $\mathcal{V}(V,N;J,I_\epsilon)$ and $\mathcal{V}(V,N;I_\epsilon)$, where the former corresponds to the region of configuration space defined simultaneously by (5-47a) and (5-47b), and the latter to the region defined by (5-47a) alone (the two regions in the configuration space are depicted schematically in fig. 5-6), each of the two measures being defined with a normalization constant $\frac{1}{N!}$ (chosen for the sake of convenience) that gets canceled in the ratio of the two:

1. $\mathcal{V}(V,N;I_\epsilon) = \frac{1}{N!} \times$ volume measure under condition (5-47a),

2. $\mathcal{V}(V,N;J,I_\epsilon) = \frac{1}{N!} \times$ volume measure under conditions (5-47a), (5-47b).


Figure 5-6: The regions in the configuration space defined by conditions (a) (5-47a), and (b) jointly by (5-47a) and (5-47b), with volume measures $\mathcal{V}(V,N;I_\epsilon)$ and $\mathcal{V}(V,N;J,I_\epsilon)$ respectively (schematic); as the thermodynamic limit is approached, the latter behaves like $e^{N\sigma(\rho;J,I_\epsilon)}$, i.e., the limit (5-49) exists.


One considers the probability mentioned above for arbitrarily specified $V$, $N$, and goes over to the thermodynamic limit, keeping $J$, $I_\epsilon$ fixed:

$$ r(V,N;J,I_\epsilon) = \frac{\mathcal{V}(V,N;J,I_\epsilon)}{\mathcal{V}(V,N;I_\epsilon)}, \qquad r(J,I_\epsilon) = \lim r(V,N;J,I_\epsilon), \tag{5-48} $$


where the denominator in the right hand side of the first equality is obtained from the numerator by replacing $J$ with the interval $(-\infty, \infty)$, i.e., by removing the constraint on $f/N$.


Lanford considered the above limit first by looking at a sequence of cubes whose volumes are made to go to $\infty$ in a controlled manner, and then by going over to regions of more general type, subject to the Van Hove condition, as the total volume is made to grow.


The main result of the analysis is the following: the asymptotic behavior of $\mathcal{V}(V,N;J,I_\epsilon)$, as the thermodynamic limit specified above is approached, is

either, (a) $\mathcal{V}(V,N;J,I_\epsilon)$ goes to zero faster than exponentially in $N$

or, (b) $\mathcal{V}(V,N;J,I_\epsilon) \sim e^{N\sigma(\rho;J,I_\epsilon)}$, where $\sigma(\rho;J,I_\epsilon)$ is finite and depends on $V$, $N$ only through $\rho$.

In other words, the limit

$$ \sigma(\rho;J,I_\epsilon) = \lim \frac{1}{N}\ln \mathcal{V}(V,N;J,I_\epsilon), \tag{5-49} $$

exists and is finite unless the possibility (a) mentioned above holds, though the latter does not represent the typical situation provided that (5-47b) is compatible with (5-47a) for sufficiently large $V$, $N$. The symbol $\sigma$ introduced above actually represents the configurational entropy (refer to sec. 5.3.2) per particle (up to a factor of $k_B^{-1}$) when $J$ stands for the interval $(-\infty, \infty)$, i.e., when $f/N$ is unconstrained.


While $\sigma(\rho;J,I_\epsilon)$, for given $\rho$, $I_\epsilon$, has been defined as a function over open sets $J$, one can define a point function $\sigma(x)$ for points $x$ (we suppress the arguments $\rho$, $I_\epsilon$, of which the latter stands for an interval) by considering, for any given real number $x$, a family of open intervals $J$ containing $x$, and then defining $\sigma(x)$ as the infimum of $\sigma(J)$ for $J$ belonging to this family. This function $\sigma(x)$ then turns out to be upper semi-continuous (i.e., for any $x$ and any $\eta > 0$, there exists a $\delta > 0$ such that $|y - x| < \delta$ implies $\sigma(y) < \sigma(x) + \eta$) and concave in $x$ (this follows from the assumption of finite range of the observable $f$) and, starting from $\sigma(x)$, one can reconstruct $\sigma(J)$ by defining, for any given open set $J$, $\sigma(J)$ as the supremum of $\sigma(x)$ for $x$ belonging to $J$.

The interpretation of the function $\sigma(x)$ introduced above is the following: starting from $\mathcal{V}(V,N;J)$ (with a given $I_\epsilon$, which we keep implied) for an arbitrarily small interval $J$ around $x$, and assuming $V$, $N$ to approach the thermodynamic limit, one can write, in the leading order of approximation,

$$ \mathcal{V}(V,N;J) \approx e^{N\sigma(x)}. \tag{5-50} $$


The function $\sigma(x)$ is concave in $x$ and is bounded above ($\sigma(x) \le 1 - \ln\rho$, refer to (5-36b), where the notation differs slightly). Consequently, its variation with $x$ can possibly be of any of the three types shown in fig. 5-7. Fig. 5-7(A) depicts a case in which $\sigma$ does not assume its maximum value for any finite $x$ which, however, is ruled out for systems characterized by superstable interactions (refer to formula (5-12a), sec. 5.2.2.1).


In fig. 5-7(B), one observes that $\sigma(x)$ assumes its maximum value exactly once, at the point $x = x_0$. The interpretation of $\sigma$ as in (5-50) then implies that, for any chosen $\eta$, however small, the probability of $f/N$ attaining any value outside the interval $(x_0 - \eta, x_0 + \eta)$ goes to zero exponentially with the size of the system. In other words, the probability distribution of $f/N$ resembles a delta function, being concentrated at one single value ($x_0$) which, evidently, equals its average value in the configurational microcanonical ensemble. This establishes, in this ensemble, the existence of the thermodynamic limit. The demonstration of the existence of the thermodynamic limit in the classical microcanonical ensemble, with the kinetic energy of the particles taken into account, then follows from considerations analogous to the ones outlined in sec. 5.3.2. The situation depicted in the figure turns out to be the typical one for finite-range two-particle observables in the low density regime.
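The exponential concentration described here can be illustrated with a deliberately simple stand-in that is not Lanford's setting: $N$ independent two-state units, with the 'observable' taken to be the fraction $x$ of units in the upper state. In this toy model (an assumption made purely for illustration) the number of configurations with $x$ in an interval $J$ grows as $e^{N\sigma(J)}$, where $\sigma(J)$ is the supremum over $J$ of the point function $\sigma(x) = -x\ln x - (1-x)\ln(1-x)$, whose single maximum $\ln 2$ lies at $x_0 = 1/2$.

```python
import math

def point_entropy(x):
    """sigma(x) for the toy model: log-count per unit at upper-state fraction x."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def log_count(n, lo, hi):
    """ln of the number of configurations of n two-state units with lo < k/n < hi."""
    logs = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            for k in range(n + 1) if lo < k / n < hi]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def sigma_interval(n, lo, hi):
    """Finite-n estimate of sigma(J) for the open interval J = (lo, hi)."""
    return log_count(n, lo, hi) / n

for n in (100, 1000, 20000):
    # interval away from x0 = 1/2: sigma(J) tends to sigma(0.3) < ln 2
    print(n, sigma_interval(n, 0.1, 0.3), point_entropy(0.3), math.log(2))
```

Since $\sigma(J) < \ln 2$ for any interval excluding $x_0$, the fraction of configurations falling in such an interval decays exponentially with $N$, which is the toy-model analogue of case (B).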


The case shown in fig. 5-7(C) corresponds to the situation in which $\sigma(x)$ assumes its maximum value throughout an interval $[x_1, x_2]$. In this case, the probability of $f/N$ attaining any value outside this interval goes to zero as the thermodynamic limit is approached, but the probability distribution within this interval is less clear. This situation may or may not correspond to a phase transition.


Analogous to the reduction from $\sigma(J, I_\epsilon)$ (for any given $\rho$) to the point function $\sigma(x, I_\epsilon)$ (where we now make explicit the dependence on the interval $I_\epsilon$), one can reduce to a function $\sigma(x, \epsilon)$, where one considers a limiting situation in which the small energy interval $I_\epsilon$ shrinks to a point $\epsilon$ depicting a sharply defined value of $\Phi^{[N]}/N$. This function is jointly concave in $x$, $\epsilon$ and is continuous in the two variables within the set in the $x$-$\epsilon$ plane in which it is finite (excepting, possibly, on the boundary of this set).


As mentioned above, the thermodynamic limit is established by first considering a sequence of cubical regions of increasing volumes and then, more generally, by considering a sequence of regions of successively increasing volumes. Only partial results are obtained when the volumes of the regions are assumed to grow in accordance with the Van Hove condition, while the more complete results outlined above are arrived at by considering a limiting transition of a more restrictive nature (such as the one in accordance with Fisher’s criterion).


Figure 5-7: Depicting the various possible types of variation of the concave function $\sigma(x)$ (schematic); (A) $\sigma(x)$ does not attain its maximum value for any finite $x$; this case is ruled out for superstable interactions; (B) $\sigma(x)$ attains its maximum value exactly once; this is the typical situation for two-particle observables in the low density regime (the one in which a virial expansion holds), and is indicative of the existence of the thermodynamic limit; (C) $\sigma(x)$ attains its maximum value throughout an interval; this may correspond to the occurrence of a phase transition.


While the above analysis is confined to the consideration of the microcanonical ensemble, Lanford establishes analogous results on the existence of the thermodynamic limit for the canonical and grand canonical ensembles as well [87].


5.3.4 Thermodynamic limit: the quantum context


The quantum mechanical ensembles have been introduced in section 2.1, and the grand canonical ensemble has been invoked in describing the behavior of the ideal gas (the limiting instance of a real dilute gas) in sec. 3.3, where one finds that the thermodynamic limit exists in the sense that the pressure and the specific internal energy are expressed in terms of well defined functions of the density and the temperature in the limit $V \to \infty$, $N \to \infty$, $N/V \to \rho$.


More generally, the thermodynamic limit exists for interacting quantum 
systems whose constituent particles interact between themselves through 
a potential energy function satisfying the temperedness and stability con- 
ditions stated in sec. 5.2. In the present section, I outline, following [61], 
the proof of existence of the thermodynamic limit for a model quantum 
mechanical system made up of interacting spins on an infinitely extended 
lattice where one does not have to deal with the problem of unbounded- 
ness of the quantum mechanical Hamiltonian for a fluid system arising 


from the kinetic energy operator of the particles constituting the system. 


An example of a spin system of the type to be considered below is the so-called Heisenberg model of ferromagnetism, where one has an infinitely extended set of discretely distributed lattice points, specified by means of the index $i$, with associated vector spin operators $\mathbf{S}_i$ such that the Hamiltonian is given by

$$ H = -2J\sum_{\langle ij\rangle} \mathbf{S}_i\cdot\mathbf{S}_j. \tag{5-51} $$

In this expression the summation is over all nearest neighbor pairs ($i$-$j$) of lattice points, and $J$ is a constant specifying the strength of interaction between the ‘spins’ at the various lattice points, where a positive value of $J$ results in the system acquiring the features of ferromagnetic behavior.


A quantum mechanical spin is represented by a vector operator $\mathbf{S}$, with components $S_x$, $S_y$, $S_z$, where any of the components, say, $S_z$, has eigenvalues $\hbar s, \hbar(s-1), \dots, -\hbar s$ for some integral or half integral quantum number $s$ (successive eigenvalues differing by $\hbar$), and where the squared spin operator $S^2 = S_x^2 + S_y^2 + S_z^2$ has the eigenvalue $\hbar^2 s(s+1)$. The components satisfy, between themselves, commutation relations of the form $[S_x, S_y] = i\hbar S_z$.


We consider a Hamiltonian (H) of a more general structure satisfying the 


following requirements: 


1. The Hamiltonian possesses the translational symmetry of the lattice. 


2. It is a sum of terms, where each term involves a finite number of spins, no two of which have a separation larger than a finite range (say, $r$), independent of the term considered, and where the term represents a Hermitian operator with finite matrix elements.


3. Each spin occurs in a finite number of terms in the Hamiltonian. 


We define the norm $|A|$ of a Hermitian operator $A$ as the maximum of the absolute values of its eigenvalues. For any two Hermitian operators $A$, $B$, the norm satisfies the triangle inequality

$$ |A + B| \le |A| + |B|. \tag{5-52} $$

We now consider the terms in the Hamiltonian that contain any specified (say, the $p$th) spin and denote these by $h_i$ ($i = 1,2,\dots,n$), where $n$ is finite in virtue of the fact that each spin occurs in only a finite number of terms in $H$. Invoking the triangle inequality, we have

$$ \Big|\sum_{i=1}^{n} h_i\Big| \le \sum_{i=1}^{n} |h_i|, \tag{5-53a} $$

which is bounded above by some constant independent of $p$ in virtue of the requirements, stated above, on the Hamiltonian, i.e.,

$$ \Big|\sum_{i=1}^{n} h_i\Big| \le C \ \text{(say)}. \tag{5-53b} $$

Thus, if $H'$ denotes the sum of certain terms of $H$ and involves at most $M$ spins, then

$$ |H'| \le CM. \tag{5-53c} $$

In particular, setting $H' = H$, one concludes that all the eigenvalues of $H$ for a lattice containing $V$ spins lie in the range $[-CV, CV]$. The thermodynamic limit for this spin system corresponds to $V \to \infty$.


The thermodynamic limit we will consider will involve a sequence of bounded spin systems on the lattice where, in each stage of the sequence, the spin system will be described by a Hamiltonian operator of the type described above, possessing discrete and bounded eigenvalues. For a system with Hamiltonian $H$, let $E$ be any specified energy which we treat as a parameter for the time being. Let the number of eigenstates of the system with energies less than or equal to $E$ be $W(E)$ ($W$ will be finite for all finite $E$ for systems we consider below). The entropy corresponding to the energy parameter $E$ is defined as

$$ S(E) = k_B \ln W(E), \tag{5-54} $$

which corresponds to the alternative definition of the microcanonical ensemble outlined in sec. 2.1.1.1. In the thermodynamic limit, this will be equivalent to the microcanonical ensemble in which all eigenstates belonging to a narrow energy range $E$ to $E + \delta E$ are equally probable, and the entropy for energy $E$ is defined by (5-54) where now $W(E)$ is the number of eigenstates belonging to the narrow energy range, the logarithm of which becomes independent of $\delta E$ in the thermodynamic limit and tends to the logarithm of the number of eigenstates with energy less than or equal to $E$ (see below).


The function $W(E)$ is left-discontinuous at the energy eigenvalues $E_n$ ($n = 1,2,\dots$) (see fig. 5-8(A); the energy eigenvalues are labeled from the lowest upward with increasing values of the index $n$; the eigenfunctions are labeled $|\phi_n\rangle$ where we have assumed the eigenvalues to be non-degenerate for the sake of simplicity; the arguments can be extended to the case of degenerate eigenvalues as well). Correspondingly, the function $U(x)$ is defined as the energy ($E_n$) of the eigenstate $|\phi_n\rangle$ for $n - 1 < x \le n$, which means $U(x)$ is right-discontinuous at the integers $x = n$ (see fig. 5-8(B)). If the number of interacting spins in the lattice (at any specified stage of going over to the thermodynamic limit, see below) be $V$, then we define the normalized quantities

$$ \sigma = V^{-1} S, \qquad \epsilon = V^{-1} E, \tag{5-55} $$

which correspond to two functions $\sigma(\epsilon)$ and $\epsilon(\sigma)$ that satisfy

$$ \sigma(\epsilon) = V^{-1}\ln W(V\epsilon), \qquad \epsilon(\sigma) = V^{-1} U(e^{V\sigma}). \tag{5-56} $$

For any given $V$, the function $\sigma(\epsilon)$ is defined for energies ($E = V\epsilon$) greater than or equal to the ground state energy of the system (for smaller values of $E$, $\sigma$ goes to $-\infty$), while $\epsilon(\sigma)$ is defined for $-\infty < \sigma \le \sigma_m$, where $\sigma \le 0$ corresponds to the ground state energy, and where $\sigma_m$ turns out to be finite for the spin systems we consider below.


Before we look at the question as to whether the function $\epsilon(\sigma)$ and its inverse $\sigma(\epsilon)$ actually exist in the thermodynamic limit $V \to \infty$, we need the following result from linear algebra.

Consider a Hermitian operator $H$ in a finite dimensional linear vector space $\mathcal{H}$, in which $\mathcal{M}$ is an $m$-dimensional linear subspace. If, for every normalized $|\phi\rangle$ in $\mathcal{M}$ the following inequality is satisfied for some specified $E$,

$$ \langle\phi|H|\phi\rangle \le E, \tag{5-57a} $$

then

$$ W(E) \ge m, \quad \text{i.e.,} \quad U(m) \le E, \tag{5-57b} $$

(reason this out; consider a basis in $\mathcal{M}$ made up of the energy eigenvectors and look at (5-57a) when $|\phi\rangle$ is the eigenvector corresponding to the largest eigenvalue; for a general subspace the result is an instance of the min-max principle). This result holds for an infinite dimensional Hilbert space as well, provided that $H$ is a self-adjoint operator with a discrete spectrum bounded from below, and $\mathcal{M}$ belongs to the domain in which $H$ is defined. The finite dimensional version, however, will suffice for the spin system we consider here.


Figure 5-8: Depicting the functions (A) $W(E)$ and (B) $U(x)$ defined in the text (schematic); in (A), the function $W(E)$, counting the number of energy eigenstates with eigenvalues less than or equal to $E$, is left-discontinuous at the energy eigenvalues (the figure shows a few successive energies along the abscissa); in (B) the function $U(x)$, giving the energy $E_n$ for $n - 1 < x \le n$ for integral values of $n$, is right-discontinuous at the integers (a few successive integer values of $x$ are shown).
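The two step functions and the equivalence of the two statements in (5-57b) can be made concrete with a short sketch. The level values below are arbitrary illustrative numbers, and the helper names `count_states` and `level_energy` are not from the text.

```python
import bisect
import math

def count_states(levels, energy):
    """W(E): number of eigenvalues less than or equal to E (levels sorted ascending)."""
    return bisect.bisect_right(levels, energy)

def level_energy(levels, x):
    """U(x): the energy E_n for n - 1 < x <= n, with n = 1, 2, ..."""
    n = max(1, math.ceil(x))
    return levels[n - 1]

levels = sorted([-1.5, -0.5, 0.25, 2.0])  # hypothetical non-degenerate spectrum

print(count_states(levels, -0.5))   # jump of W sits exactly at an eigenvalue
print(level_energy(levels, 2.0))    # U at an integer takes the lower level
print(level_energy(levels, 2.1))    # just above the integer U has jumped
```

With these definitions, $W(E) \ge m$ holds exactly when $U(m) \le E$, which is the sense in which the two functions are inverses of one another.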


With this background we now see how the thermodynamic limit is established for a spin system of the type referred to above.

Considering two spin systems distinguished by indices 1, 2 (e.g., spins in two halves of a cube), we write the Hamiltonian of the composite system as

$$ H = H_1 + H_2 + H', \tag{5-58} $$

where the first two terms refer to the two sub-systems considered independently of each other and the third term denotes the interaction of the spins belonging to the two. Let $|\psi_i\rangle$ ($i = 1,2,\dots,n$) denote the eigenvectors belonging to the lowest $n$ eigenvalues of $H_1$ (we assume non-degenerate eigenvalues for the sake of simplicity; however, the arguments apply to the case of degenerate eigenvalues as well), and similarly, let $|\chi_j\rangle$ ($j = 1,2,\dots,m$) be the lowest $m$ eigenvectors of $H_2$. The product vectors $|\psi_i\rangle|\chi_j\rangle$ ($1 \le i \le n$, $1 \le j \le m$) then constitute a basis of an $nm$-dimensional subspace, in which any normalized vector $|\phi\rangle$ satisfies

$$ \langle\phi|H|\phi\rangle \le U_1(n) + U_2(m) + \langle\phi|H'|\phi\rangle, \tag{5-59} $$

(reason this out; set $|\phi\rangle$ to be the product vector $|\psi_n\rangle|\chi_m\rangle$). Invoking the result expressed in (5-57a), (5-57b) (setting the right hand side of (5-59) as $E$, and noting that $\langle\phi|H'|\phi\rangle$ is bounded above by $|H'|$), one obtains

$$ U(nm) \le U_1(n) + U_2(m) + |H'|. \tag{5-60} $$


As is evident here, $U_1$, $U_2$, $U$ refer to $H_1$, $H_2$, $H$ respectively. Though written for integer values of $n$, $m$, the above equation holds for real $n$, $m$ greater than zero and less than or equal to the total number of levels of $H_1$, $H_2$ respectively, in accordance with the definition in the paragraph preceding (5-55).
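The inequality (5-60) can be verified by brute force on a small example. The sketch below uses an open Ising chain, $H = -J\sum_i s_i s_{i+1}$ with $s_i = \pm 1$, as a stand-in; this is an assumption made for convenience, since such a Hamiltonian is diagonal in the $S_z$ basis and its levels (here degenerate) can be enumerated without matrix diagonalization. The chain is cut into two equal halves, so that $H'$ is the single bond joining them and $|H'| = |J|$. The function names are illustrative.

```python
import itertools

def chain_levels(length, coupling=1.0):
    """Sorted energies of an open Ising chain H = -J * sum_i s_i s_{i+1}."""
    energies = []
    for spins in itertools.product((-1, 1), repeat=length):
        bonds = sum(a * b for a, b in zip(spins, spins[1:]))
        energies.append(-coupling * bonds)
    return sorted(energies)

def check_subadditivity(half_length, coupling=1.0):
    """True if U(nm) <= U1(n) + U2(m) + |H'| for all integer n, m."""
    u1 = chain_levels(half_length, coupling)      # levels of H_1
    u2 = u1                                       # H_2 is an identical half
    u = chain_levels(2 * half_length, coupling)   # levels of the composite H
    bond_norm = abs(coupling)                     # |H'| for the connecting bond
    return all(
        u[n * m - 1] <= u1[n - 1] + u2[m - 1] + bond_norm + 1e-12
        for n in range(1, len(u1) + 1)
        for m in range(1, len(u2) + 1)
    )

for half in (1, 2, 3, 4):
    print(half, check_subadditivity(half))
```

For some pairs $(n, m)$ the bound is attained with equality, for instance the topmost levels, where every bond of the composite chain is unsatisfied.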


Generalizing to a spin system comprised of, say, $I$ subsystems denoted by indices $i = 1,2,\dots,I$ ($I \ge 2$), one obtains

$$ U\Big(\prod_{i=1}^{I} n_i\Big) \le \sum_{i=1}^{I} U_i(n_i) + |H'|, \tag{5-61} $$

where $H'$ includes all terms in $H$ involving spins of more than one subsystem. The inequality (5-61) finally allows us to arrive at the thermodynamic limit as follows.


We consider, without loss of generality ([61]), a simple cubic lattice with lattice constant 1, and imagine a sequence of cubical regions $\Omega_k$ ($k = 1,2,\dots$), where $\Omega_k$ ($k = 2,3,\dots$) contains eight cubes of type $\Omega_{k-1}$ and includes $V_k = 2^{3k}$ spins. We apply (5-61) to $\Omega_k$ with $I = 8$, with four of those characterized by $\sigma = \sigma_1$ (i.e., $n_i = e^{V_{k-1}\sigma_1}$ for $i = 1,\dots,4$), and the remaining four by $\sigma = \sigma_2$ ($n_i = e^{V_{k-1}\sigma_2}$ for $i = 5,\dots,8$; we make use of the index $k$ to refer to the cubical regions in the successive stages of attaining the thermodynamic limit). Formula (5-61) then gives

$$ \epsilon_k\Big(\frac{1}{2}\sigma_1 + \frac{1}{2}\sigma_2\Big) \le \frac{1}{2}\epsilon_{k-1}(\sigma_1) + \frac{1}{2}\epsilon_{k-1}(\sigma_2) + \frac{|H'|}{V_k}, \tag{5-62} $$

where both sides have been divided by $V_k$.


Each term in $H'$ involves spins in at least two of the smaller cubes and involves, by the assumption of finite range of interaction among the spins, only those spins that are located within a distance of $r$ from each of the common surfaces of the smaller cubes (i.e., those of volume $V_{k-1}$ in the present instance). Since there are twelve such common surfaces, each containing at most $2^{2k}$ spins, we can use (5-53c) with $M = 12r2^{2k}$ in estimating the last term on the right hand side of (5-62), so as to obtain

$$ \frac{|H'|}{V_k} \le 12rC2^{-k} = C'2^{-k} \ \text{(say)}, \tag{5-63a} $$

where $C'$ is independent of $k$. One thereby obtains (setting $\sigma_1 = \sigma_2 = \sigma$)

$$ \epsilon_k(\sigma) + \frac{C'}{2^k} \le \epsilon_{k-1}(\sigma) + \frac{C'}{2^{k-1}}, \tag{5-63b} $$

an inequality analogous to (5-31) that led to the thermodynamic limit for the classical microcanonical ensemble for a gas made up of interacting particles.
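The step from (5-62) and (5-63a) to (5-63b) can be spelled out as a short worked derivation, using only the quantities defined above with $\sigma_1 = \sigma_2 = \sigma$:

```latex
\begin{align*}
\epsilon_k(\sigma)
  &\le \epsilon_{k-1}(\sigma) + \frac{|H'|}{V_k}
  && \text{(5-62) with } \sigma_1 = \sigma_2 = \sigma \\
  &\le \epsilon_{k-1}(\sigma) + \frac{C'}{2^{k}}
  && \text{by (5-63a)} \\
\Longrightarrow\quad
\epsilon_k(\sigma) + \frac{C'}{2^{k}}
  &\le \epsilon_{k-1}(\sigma) + \frac{2C'}{2^{k}}
   = \epsilon_{k-1}(\sigma) + \frac{C'}{2^{k-1}} .
\end{align*}
```

Adding $C'/2^k$ to both sides is what turns the original bound, which allows $\epsilon_k$ to exceed $\epsilon_{k-1}$ slightly, into a statement that the shifted sequence is non-increasing in $k$.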


As we have already seen, the energy eigenvalues of a lattice involving $V_k$ spins are bounded below by $-V_kC$, where $C$ is independent of $k$. Thus, the decreasing sequence $\epsilon_k(\sigma) + \frac{C'}{2^k}$ is bounded below, and hence must approach a limit as $k \to \infty$, which we write as $\epsilon(\sigma)$. This establishes the thermodynamic limit for $\epsilon(\sigma)$ in the special case when the limit is approached through the special sequence of cubical regions $\Omega_k$ indicated above.


The inequality (5-62), together with the fact that $\epsilon(\sigma)$ is bounded above by the left hand side of (5-63b) (for any chosen $k$), implies that the function $\epsilon(\sigma)$ is convex and continuous (excepting, perhaps, at the upper limit $\sigma_m$ of the range of values of $\sigma$ over which $\epsilon$ is defined). Further, $\epsilon(\sigma)$ is monotone increasing in $\sigma$ (since $\epsilon_k(\sigma)$ has this property for any given $k$, by definition (5-54)-(5-56)). By the property of convex functions, one observes that the derivative $T = \frac{d\epsilon}{d\sigma} \ge 0$ is defined at all points of the range of $\sigma$ for which $\epsilon(\sigma)$ is defined except, perhaps, at a countable set of points. Fig. 5-9 shows the general appearance of the limiting function $\epsilon(\sigma)$.


Figure 5-9: Depicting the variation (schematic) of the limiting function $\epsilon(\sigma)$; $\epsilon(\sigma)$ is a convex continuous function, increasing in $\sigma$ in the range $0 < \sigma < \sigma_m$; the derivative $T = \frac{d\epsilon}{d\sigma}$ is positive and is defined at all values of $\sigma$ except, possibly, at a countable number of points; for $\sigma < 0$, $\epsilon(\sigma)$ remains constant at $\epsilon = \epsilon_0$.


Having established the thermodynamic limit for a special sequence of cubical volumes, one can now generalize to an arbitrary sequence of regions of successively increasing volumes, provided the approach to the limit is constrained by Fisher’s condition (refer to [61], which includes an outline). From the characterization of the function $\epsilon(\sigma)$, one can infer the existence of the inverse function $\sigma(\epsilon)$ in the thermodynamic limit, where $\sigma$ is again an increasing function of $\epsilon$, but now a concave one, and where $\sigma$ is defined for the range $\epsilon_0 < \epsilon < \epsilon_m$, where $\epsilon_0 = \epsilon(\sigma = 0)$ is the ground state energy of the infinitely extended system and $\epsilon_m$ specifies the upper bound to its energy spectrum. As mentioned above, the same function $\sigma(\epsilon)$ results when one defines the microcanonical ensemble in terms of a small energy interval $E$ to $E + \delta E$ (where $\delta E$ is at most of the order of $V^{-1}E$ for a system of size $V$) and the entropy is defined by (5-54), with $W$ being the number of energy eigenstates in the said interval.


While the above derivation of the thermodynamic limit pertains to a lattice 
system, Griffiths (in [61], sec. IV) outlines a derivation for a quantum 
gas where the Hamiltonian includes a term involving the kinetic energy 


operator of the system of particles making up the gas. 


5.4 Equivalence of ensembles 


5.4.1 The equivalence problem: introduction 


The present section will engage our attention on the equivalence problem, one that I briefly outlined in sections 1.2.4 and 1.2.5. Recall that the thermodynamic description of equilibrium states of a system can be given in terms of alternative sets of variables, related to one another by the Legendre transformation (as mentioned earlier, a clear exposition of this aspect of thermodynamics is to be found in [15], in particular, in chapter 5). This is reflected in the fact that the equilibrium ensembles of statistical mechanics are all equivalent in the thermodynamic limit. We will assume the existence, uniqueness, and continuity of the specific entropy $s$ ($\sigma$ in the notation of sec. 5.3.4) as a function of $\epsilon$, the specific internal energy, and $\rho$, the particle number density, and its concavity property.


Consider a system S of $N$ identical particles confined in a volume $V$, with energy within a narrow range $U_m$ to $U_m + \delta U_m$ ($\delta U_m \ll U_m$, but otherwise arbitrary for now) in a mixed stationary state described by the microcanonical ensemble (refer to sec. 2.1.1). For the sake of concreteness, we will consider the quantum mechanical description here, since the classical description is in the nature of an approximation. One can define the von Neumann entropy of the system (refer to formula (2-7)), with which one can associate a quantity analogous to the thermodynamic entropy ($S_m$; refer to formulas (2-2), (2-6)) by multiplying with the Boltzmann constant $k_B$. This entropy, however, is a discontinuous function of the energy $U_m$ and depends, moreover, on a large number of variables of a contingent nature (such as the detailed shape of the walls of the containing vessel and the nature of interactions of the particles with the walls), and further progress, making possible meaningful statements about the state of the system, is achieved by going over to the thermodynamic limit.


In this limit, $S_m$ becomes the thermodynamic entropy, and the
corresponding temperature $T_m$ is then defined by means of (2-9), while
the thermodynamic internal energy is identified with the $U_m$ we started
with (the energy interval $\delta U_m$, if properly interpreted, has no
thermodynamic relevance). Other relevant thermodynamic functions can then
be constructed from $U_m$, $S_m$, $T_m$ (all defined with reference to the
microcanonical ensemble we started with). In particular, the free energy
is given by the formula


$$F_m = U_m - T_m S_m. \tag{5-64}$$


In the thermodynamic limit, the reference to all extensive variables
can be done away with, and all the relations between the relevant
state variables can be expressed by making use of the specific values
(i.e., values per particle or per unit volume; recall that we are
considering a system of just one single component for the sake of
simplicity) of the extensive variables, along with the intensive ones.
The proof of existence of the thermodynamic limit (see sec. 5.3) tells
us that $\lim_{V\to\infty} \frac{S_m}{V}$ exists as a uniquely defined function of
$\lim_{V\to\infty} \frac{U_m}{V}$ for any given value of $\lim_{V\to\infty} \frac{N}{V}$.


We now consider the same system (‘same’ in the sense of being described
in terms of an identical potential energy function as in the previous
description) with identical values of $V$ and $N$, but now with a specified
value of the temperature rather than the energy. We denote the temperature
by $T_c$, which can be imposed on the system by means of a heat bath with
which it can exchange energy without any change in volume or the number of
particles. The mixed stationary state of the system will now be described
by means of the canonical ensemble (with parameter $\beta_c = \frac{1}{k_B T_c}$). We can
now define the free energy (recall that it is not necessary to consider the
thermodynamic limit at this stage) by means of (2-18) and the entropy
by (2-20). The state variables so defined can be seen to satisfy the
formula


$$F_c = \langle E \rangle - T_c S_c, \tag{5-65}$$


where $\langle E \rangle$ is the mean energy of the system described by the canonical
ensemble.


If we now consider the thermodynamic limit, then the state variables
$T_c$, $F_c$, $S_c$ acquire the meaning of representing the thermodynamic
temperature, free energy, and entropy of the system, while the
thermodynamic internal energy $U_c$ is obtained as the value of $\langle E \rangle$ in
the thermodynamic limit.


The thermodynamic limit of the canonical ensemble exists, as can 
be proved in a manner analogous to the case of the microcanonical 
ensemble. The derivations - technical in nature - in the case of a 
one-component fluid in the quantum context are to be found in [121], 
section 3.5. The corresponding derivation in the classical context is 


outlined in [100]. 


The problem of equivalence between ensembles then consists of showing 
that the variables so defined in the two ensembles in the thermodynamic 


limit are pairwise identical, i.e., 


$$U_m = U_c, \quad T_m = T_c, \quad S_m = S_c, \quad F_m = F_c, \tag{5-66}$$


where $F_c$ is defined from (5-65) in the thermodynamic limit.


5.4.2 Equivalence of microcanonical and canonical ensembles: the saddle point approximation


I outline here an informal proof of the equivalence between the canonical 


and the microcanonical ensembles. 


We start from a system of $N$ particles confined within a volume $V$,
interacting through a potential belonging to the class of potentials for
which the thermodynamic limit exists, and consider a mixed stationary state
described in terms of the canonical ensemble characterized by a
temperature $T_c = (k_B \beta_c)^{-1}$, where we will use the suffix ‘c’ (resp. ‘m’)
to indicate that some state variable under consideration is being defined
with reference to the canonical (resp. microcanonical) ensemble of the
given system (for given values of $V$, $N$ and for a given interaction
potential; we will consider the limit of large $V$ and $N$ under the
constraint that at every stage of the limiting process, the two ensembles
describe the same system). We assume that the system is described
classically, but the basic idea of the proof carries over to quantum
systems as well.


The canonical partition function is given by the usual phase space
integral which we express as an integral over the energy $E$ (i.e., the
value of the Hamiltonian function) of the system as


$$Z_c(\beta_c) = \int dE\, g(E)\, e^{-\beta_c E}, \tag{5-67}$$


where $g(E)$ stands for the density of states at energy $E$ and where the
integration is over all possible energy values.


Considering a narrow energy interval, say, between $E$ and $E + \delta E$, one
can write


$$g(E)\,\delta E = \frac{1}{N!\,h^{3N}} \int_{R_E} d^{(3N)}r\, d^{(3N)}p, \tag{5-68}$$


where $R_E$ stands for the thin ‘energy shell’ in the phase space of the
system corresponding to the narrow energy range under consideration
(refer to section 2.2.1.1 and to fig. 2-2, where the notation differs
slightly).


Refer now to the microcanonical ensemble describing the same system,
with energy lying in the narrow range between, say, $E$ and $E + \delta_0$, where
$\delta_0$ ($\ll E$) is assumed to be independent of $E$ for the sake of simplicity
(recall that we are here concerned only with an informal proof of
equivalence). The entropy $S_m(E)$ at this energy is given by


$$e^{k_B^{-1} S_m(E)} = \delta_0\, g(E) \tag{5-69}$$


(reason this out; refer back to section 2.2.1.1; the right hand side
represents the volume of the energy shell between $E$ and $E + \delta_0$ in the
phase space, scaled by the normalization factor $\frac{1}{N!\,h^{3N}}$, i.e., the number of
microstates within the energy shell). We can thus rewrite (5-67) as


$$Z_c(\beta_c) = e^{-\beta_c F_c} = \delta_0^{-1} \int dE\, \exp\left(-\beta_c E + k_B^{-1} S_m(E)\right), \tag{5-70}$$


where, in the thermodynamic limit, $F_c$ goes over to the free energy,
expressed in terms of the variables $\beta_c$, $V$, $N$ (recall that
thermodynamic states can be represented in terms of alternative sets of
state variables related by Legendre transformations; refer to sections
1.2.5 and 1.2.6). We now go over to this limit, when the right hand side
can be evaluated to a good degree of approximation (which becomes an exact
one in the limit) by invoking the saddle point technique (see [46],
section 2.3, [63]). The saddle point or the steepest descent method is
generally employed to calculate an integral of the form


$$I = \int dx\, e^{f(x)}, \tag{5-71}$$


where $f(x)$ possesses a sharp maximum at some $x = x_0$ ($f'(x_0) = 0$,
$f''(x_0) < 0$). Expanding $f(x)$ around $x = x_0$ and retaining terms up to
the second degree in $x - x_0$, one obtains a Gaussian integral which
evaluates to


$$I \approx \sqrt{\frac{2\pi}{|f''(x_0)|}}\; e^{f(x_0)} \tag{5-72}$$


(check this out). The degree of accuracy of this approximation can be
estimated by expanding $f(x)$ up to terms of the third degree. In the
present instance, the approximation turns out to be exact in the
thermodynamic limit, as mentioned above.
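

As a rough numerical illustration of (5-72), the following Python sketch compares the Gaussian estimate with an integral that can be evaluated in closed form. The test function $f(x) = n(\ln x - x)$ on $x > 0$ is chosen here purely for illustration and does not appear in the derivation above; for it, $x_0 = 1$, $f(x_0) = -n$, $f''(x_0) = -n$, and the exact integral is $\Gamma(n+1)/n^{n+1}$. All function names are hypothetical.

```python
import math


def exact_log_integral(n: int) -> float:
    """ln of I(n) = integral_0^inf exp(n*(ln x - x)) dx = Gamma(n+1) / n**(n+1)."""
    return math.lgamma(n + 1) - (n + 1) * math.log(n)


def saddle_point_log_integral(n: int) -> float:
    """ln of the Gaussian estimate sqrt(2*pi/|f''(x0)|) * exp(f(x0)).

    For the illustrative choice f(x) = n*(ln x - x): x0 = 1, f(x0) = -n,
    f''(x0) = -n.
    """
    return 0.5 * math.log(2.0 * math.pi / n) - n


def relative_error(n: int) -> float:
    """Relative error of the saddle point estimate with respect to the exact value."""
    log_ratio = saddle_point_log_integral(n) - exact_log_integral(n)
    return abs(math.exp(log_ratio) - 1.0)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, relative_error(n))
```

The large parameter $n$ plays the role of the system size here: the relative error of the estimate shrinks as $n$ grows, which is the sense in which the approximation becomes exact in the limit.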


Evaluating the condition $f'(x_0) = 0$ in the present context, one obtains a
value $E = U_m$ (the reason for using the suffix ‘m’ will be apparent in a
moment) given by


$$\left.\frac{\partial S_m}{\partial E}\right|_{E=U_m} = k_B \beta_c, \quad \text{i.e.,} \quad \left.\frac{\partial S_m}{\partial E}\right|_{E=U_m} = \frac{1}{T_c}, \tag{5-73}$$


which tells us that the temperature of the system described by the
microcanonical ensemble at energy $U_m$ (recall that $S_m(E)$ stands for the
microcanonical entropy at energy $E$) is $T_c$ (refer to formula (2-59) and
to section 2.2.1.2), i.e., in other words,


$$T_m = T_c, \tag{5-74}$$


provided that $U_m$ is interpreted as being the energy of the system in the
microcanonical description of the state whose canonical description is
characterized by the temperature $T_c$.


The formula (5-70) now tells us that the free energy in the canonical
description is given by


$$F_c(\beta_c) = -\beta_c^{-1}\left[\ln A - \beta_c U_m + k_B^{-1} S_m(U_m)\right] = F_m - \beta_c^{-1} \ln A, \tag{5-75}$$


where $A = \delta_0^{-1}\sqrt{2\pi/|f''(x_0)|}$ is the prefactor resulting from (5-70)
and (5-72), and where the definition of the free energy in the
microcanonical ensemble is invoked (along with formula (5-74)). The term
involving $\ln A$ turns out to be insignificant in the thermodynamic limit
since, following (5-72), one obtains, for $f''(x_0)$, the expression


$$f''(x_0) = \left.\frac{\partial^2}{\partial E^2}\left(-\beta_c E + k_B^{-1} S_m(E)\right)\right|_{E=U_m} = -\frac{1}{k_B T_m^2 C_V} < 0 \tag{5-76}$$


(check this out; note that the negative sign is in accord with the
requirement for the saddle point approximation to hold; $C_V$ represents the
specific heat at constant volume) and


$$-\beta_c^{-1} \ln A = \beta_c^{-1}\left[-\frac{1}{2}\ln(2\pi) + \ln \delta_0 - \frac{1}{2}\ln\left(k_B T_m^2 C_V\right)\right]. \tag{5-77}$$


Assuming that $\delta_0$ does not grow with the size $N$ of the system faster than
some finite power, we find that the term $\beta_c^{-1} \ln A$ in (5-75) behaves like
$\ln N$ while $F_m$ grows with size like $N$, and thus one finally has, in the
thermodynamic limit,


$$F_c = F_m, \tag{5-78}$$


which establishes the equivalence of the free energy in the two ensembles 


in the thermodynamic limit. 


The mean energy $U_c$ in the canonical ensemble, given by the expression


$$U_c = \frac{1}{Z_c} \int dE\, E\, g(E)\, e^{-\beta_c E}, \tag{5-79a}$$


when evaluated in the saddle point approximation outlined above, gives


$$U_c = U_m \tag{5-79b}$$


(check this out), telling us that the identification of $U_m$ as the energy
in the microcanonical representation is consistent with the equivalence of
the two ensembles.


Finally, the entropy in the canonical representation, defined as
$S_c = -\frac{\partial F_c}{\partial T_c}$, can be seen to coincide with $S_m$ when one substitutes
$F_m$ for $F_c$ and makes use of the equality $T_c = T_m$ after evaluating the
derivative with respect to $T_c$ (check this out).


The equivalence of the two ensembles in the thermodynamic limit is thus 
established in accordance with (5-66) (in reality, not all of the equalities 


in (5-66) are independent requirements for equivalence). 


The basic idea underlying all this is the fact that, while the canonical 
ensemble represents a probability distribution over microstates of all en- 
ergies in the entire phase space of the system, as compared to the mi- 
crocanonical ensemble which involves only the microstates in a narrow 
energy range, the two become effectively identical in the thermodynamic 
limit since the relative fluctuation in energy goes to zero in the canonical 
ensemble as the thermodynamic limit is approached. Indeed, starting 
from the expression for the mean energy in the canonical ensemble for a 
finite system, 


$$\langle E \rangle = \frac{1}{Z_c} \int dE\, E\, g(E)\, e^{-\beta_c E}, \tag{5-80}$$


where the integration is over all the possible energies of the system, and
then making use of the corresponding expression for $\langle E^2 \rangle$, one obtains


$$\langle E^2 \rangle - \langle E \rangle^2 = k_B T_c^2\, \frac{\partial \langle E \rangle}{\partial T_c} \tag{5-81}$$


(check this out). In the thermodynamic limit, the right hand side of this
formula gives $k_B T_c^2 C_V$, which is an extensive variable, telling us that
the relative fluctuation $\frac{\sqrt{\langle E^2 \rangle - \langle E \rangle^2}}{\langle E \rangle}$ indeed goes to zero in the
thermodynamic limit (reason this out).
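

The fluctuation formula (5-81) and the vanishing of the relative fluctuation can be checked numerically on a toy model. The sketch below assumes, purely for illustration, a system of independent two-level units with level spacing `eps` (so that the energies are $E_k = k\,\epsilon$ with degeneracy $\binom{N}{k}$ playing the role of $g(E)$), and works in units with $k_B = 1$. The model and all function names are assumptions that do not come from the text above.

```python
import math


def canonical_moments(n_sites: int, beta: float, eps: float = 1.0):
    """Mean and variance of the energy for n_sites independent two-level units.

    The energy levels are E_k = k*eps with degeneracy C(n_sites, k); the sums
    below are the discrete analogue of the integral in (5-80).
    """
    log_w = [
        math.lgamma(n_sites + 1) - math.lgamma(k + 1)
        - math.lgamma(n_sites - k + 1) - beta * k * eps
        for k in range(n_sites + 1)
    ]
    shift = max(log_w)  # subtract the largest exponent to avoid overflow
    w = [math.exp(lw - shift) for lw in log_w]
    z = sum(w)
    mean = sum(k * eps * wk for k, wk in enumerate(w)) / z
    mean_sq = sum((k * eps) ** 2 * wk for k, wk in enumerate(w)) / z
    return mean, mean_sq - mean ** 2


def fluctuation_check(n_sites: int, temp: float, d_temp: float = 1e-4):
    """Return (variance of E, T^2 * d<E>/dT), the two sides of (5-81) with k_B = 1."""
    _, var = canonical_moments(n_sites, 1.0 / temp)
    e_plus, _ = canonical_moments(n_sites, 1.0 / (temp + d_temp))
    e_minus, _ = canonical_moments(n_sites, 1.0 / (temp - d_temp))
    return var, temp ** 2 * (e_plus - e_minus) / (2.0 * d_temp)


def relative_fluctuation(n_sites: int, temp: float) -> float:
    """sqrt(<E^2> - <E>^2) / <E> for the toy model."""
    mean, var = canonical_moments(n_sites, 1.0 / temp)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    print(fluctuation_check(1000, 1.0))
    for n in (100, 10_000):
        print(n, relative_fluctuation(n, 1.0))
```

For this model both the mean energy and its variance are proportional to the number of units, so the relative fluctuation falls off as $N^{-1/2}$: increasing the size by a factor of 100 reduces it by a factor of 10.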


This same feature of vanishing relative fluctuations is also revealed in 
the analysis of Lanford, briefly outlined in sec. 5.3.3, where one finds 
that the probability distribution of finding the value of any observable 
quantity (subject to certain physically realistic conditions) approaches a 
delta function in a state in which the energy lies within some given narrow 
range (i.e., in one described by a microcanonical ensemble), the same 
result being also true in the case of a state described by the canonical 
ensemble. According to Lanford’s analysis, the delta function distribution 


is concentrated at the mean value of the observable under consideration. 


Incidentally, one can check the effect of the energy interval $\delta_0$, which
we now write as $\delta U_m$, on the entropy obtained in the saddle point
approximation. Evaluating the right hand side of (5-75) in the present
context and exponentiating, one obtains


$$e^{k_B^{-1} S_c} = \frac{\delta U_m}{\sqrt{2\pi k_B C_V T_c^2}}\; e^{\beta_c (U_c - F_c)} \tag{5-82}$$


(check this out), where we have used the relations $U_m = U_c$, $S_m = S_c$
without affecting the conclusion to be obtained below. Comparing this with
the exact relation $S_c = \frac{U_c - F_c}{T_c}$, one concludes that the effect of the
energy spread $\delta U_m$ on the conclusions reached above becomes
insignificant if


$$\ln \frac{\delta U_m}{\sqrt{2\pi k_B C_V T_c^2}} \ll N, \tag{5-83}$$


a condition that is indeed satisfied if $\delta U_m$ grows no faster than some
finite power of $N$ (refer back to the paragraph following (3-21)).


Finally, one can check that the thermodynamic limit is, precisely, the
condition under which the saddle point approximation gives us the exact
formulas (5-66). This requires that, in the general setting, one expand
$f(x)$ in the right hand side of (5-71) up to terms of the fourth degree
in $x - x_0$ (the term of the third degree in the expansion does not
contribute to the integral) and compare this with the leading
approximation to $I$. This exercise leads to the result that $\ln Z_c$ can
indeed be replaced with $-(k_B T_m)^{-1} F_m$ provided that


$$\ln\left(1 + \frac{1}{8}\,\frac{f^{(4)}(x_0)}{\left(f''(x_0)\right)^2}\right) \ll f(x_0), \tag{5-84a}$$


where, in the present context, the variable $x$ stands for $E$, the
stationary point $x_0$ for $U_m$, the derivatives are to be evaluated at
$E = U_m$, and


$$f(E) = -\beta_c E + k_B^{-1} S_m(E). \tag{5-84b}$$


On evaluating the left hand side of (5-84a), one finds that it is of the
order of unity (check this out) while the right hand side has already been
seen to be of the order of the size of the system, which goes to infinity.


The saddle-point approximation establishes the relation of transformation
between the pairs $U_m$, $S_m$ and $T_c$, $F_c$ which, as we have already
mentioned, is an instance of Legendre transformation. The equivalence
problem in statistical mechanics has an extensive literature devoted to
it. While our treatment in the present chapter is an informal one,
rigorous results for classical and quantum systems are to be found in
[87], [61], [54], [13], among others.


In concluding this section, I mention the fact that, in the context of the
saddle point approximation of $Z_c$, the integrand $e^{f(x)}$ of eq. (5-71) is
(replacing $x$ with $E$ for easy comparison)


$$e^{f(x)} \to e^{f(E)} = \exp\left[-\beta_c E + k_B^{-1} S_m(E)\right], \tag{5-85}$$


where, in the thermodynamic limit, $S_m(E)$ is a concave function of $E$.
This implies that eq. (5-73) (in either of the two forms given) has either
a unique solution $U_m$ or else a continuum of solutions covering a certain
range (see figures 5-10(A), (B)), of which we have considered only the
first possibility in the above paragraphs for the sake of concreteness.


The second possibility, on the other hand, corresponds to the case of
a phase transition at the temperature $T_c$ ($= T_m$) where the entropy varies
linearly with internal energy over a range of the latter in which two
phases coexist. The equivalence between ensembles continues to hold but, in
the Legendre-transformed representation (i.e., the one described in terms
of the canonical ensemble), the phase transition shows up not in the form
of a range of variation of the free energy $F_c$, but as a discontinuity of
the derivative of the latter as a function of the temperature. This is
precisely what was indicated in section 1.2.6 and depicted in fig. 1-2.


[Figure 5-10: two panels, (A) and (B), each showing the graph of S against U; panel (B) includes a linear segment labelled ‘coexistence of phases’, with its slope marked.]


Figure 5-10: Depicting the two possibilities in respect of the solution of the
equation (5-73) (either of the two forms given), relating to the concavity of
entropy $S$ as a function of the internal energy $U$; (A) the solution, which
corresponds to a point on the graph at which the slope is $T^{-1}$, is unique;
(B) the solution corresponds to a range of values of $U$ (and of $S$) where the
graph is linear (compare with fig. 1-2(A)).


5.4.3 Equivalence of the canonical and grand canonical ensembles


The grand canonical ensemble involves fluctuations in the number of
particles in addition to the fluctuations in energy. Of these the latter
are shared by the canonical ensemble where, as we have already seen,
the relative energy fluctuations go to zero in the thermodynamic limit,
implying the equivalence between the canonical and the microcanonical
ensembles.


In a similar manner, the relative fluctuation in the particle number in
the grand canonical ensemble, defined as $\frac{\sqrt{\langle N^2 \rangle - \langle N \rangle^2}}{\langle N \rangle}$, also goes to zero in
the thermodynamic limit. This is readily seen from formula (4-86), which
implies


$$\frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle^2} = \frac{k_B T \kappa_T}{V} \tag{5-86}$$


in a notation that is by now familiar ($\kappa_T$ being the isothermal
compressibility), from which one finds that the relative fluctuation in the
particle number does indeed go to zero in the thermodynamic limit.


5.5 The large deviation principle 


The content of this section is principally based on [138]; see also [105].


5.5.1 Large deviation principle: introduction


Results in an earlier section (sec. 5.3.3; refer especially to eq. (5-50))
illustrate the large deviation principle, which is of great relevance in
equilibrium and non-equilibrium statistical mechanics. In this book,
however, I will do no more than outline the principle and mention a few of
its applications. The large deviation principle unifies the entire field of
statistical mechanics and reveals its power in pointing towards a sound
foundation of non-equilibrium statistical mechanics, once again proving
to be a theoretical approach of great generality ([138], [105]), especially
in the context of long-time behavior of chaotic dynamical systems.


The theory of large deviations relates to the exponential decay of
probabilities of large fluctuations in random systems. The term large
deviation principle in respect of a probability $P_n$ essentially implies the
validity of a scaling law of the form


$$P_n \approx e^{-nI}, \tag{5-87}$$


where $n$ is a parameter (such as the number of particles making up a
system in statistical mechanics, or the time of evolution of a dynamical
system) assumed to be large, and $I$ is some positive constant.


One of the most well-known early instances of a large deviation formula 
is Einstein’s fluctuation formula. The relevant theory is briefly sketched in 


sec. 5.5.3.3 below. 


We begin with a simple example illustrating the large deviation
principle. Consider a sequence of independent and identically distributed
(‘IID’) random variables $\{X_1, X_2, \ldots, X_n\}$, each of which follows a
Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Let $S_n$ stand for
the random variable giving the sum of the $X_i$, scaled by $n$ (commonly
referred to as the ‘sample mean’),


$$S_n = \frac{1}{n} \sum_{i=1}^{n} X_i. \tag{5-88}$$


In this case one can work out the probability density ($p(s)$; the
probability for $S_n$ having a value between $s$ and $s + ds$ is $p(s)ds$) for
$S_n$ as


$$p(s) = \sqrt{\frac{n}{2\pi\sigma^2}}\; e^{-\frac{n(s-\mu)^2}{2\sigma^2}} \tag{5-89a}$$


(recall that the mean of $n$ IID Gaussian variables is again Gaussian, with
variance scaled down by a factor of $n$). For large $n$, one then has


$$p(s) \approx e^{-nI(s)}, \qquad I(s) = \frac{(s-\mu)^2}{2\sigma^2}, \tag{5-89b}$$


in the sense that $\lim_{n\to\infty} \frac{\ln n}{n} = 0$, so that the prefactor in (5-89a) does
not contribute to $-\frac{1}{n}\ln p(s)$ in the limit. This constitutes an instance
of large deviation approximation, with rate function $I(s)$, which is convex
in $s$ and has a minimum at $s = \mu$ where $I(s) = 0$. It tells us that, for
large $n$, deviations of $S_n$ from $s = \mu$ are exponentially rare. In other
words, $S_n$ obeys the law of large numbers, and the above formula gives a
quantitative estimate as to how $S_n$ converges to $\mu$. Put differently, the
rate function gives us a detailed picture of fluctuations of the random
variable $S_n$ around its most probable value ($\mu$).
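

The convergence expressed by (5-89b) can be seen directly by evaluating $-\frac{1}{n}\ln p(s)$ from (5-89a) for increasing $n$. The following Python sketch does this; the function names and the default values $\mu = 0$, $\sigma = 1$ are illustrative assumptions.

```python
import math


def log_density_sample_mean(s: float, n: int, mu: float = 0.0, sigma: float = 1.0) -> float:
    """ln p(s) for the mean of n IID Gaussian variables, following (5-89a)."""
    prefactor = 0.5 * math.log(n / (2.0 * math.pi * sigma ** 2))
    return prefactor - n * (s - mu) ** 2 / (2.0 * sigma ** 2)


def rate_function(s: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """The rate function I(s) of (5-89b)."""
    return (s - mu) ** 2 / (2.0 * sigma ** 2)


def finite_n_rate(s: float, n: int, mu: float = 0.0, sigma: float = 1.0) -> float:
    """-(1/n) ln p(s), which should approach I(s) as n grows."""
    return -log_density_sample_mean(s, n, mu, sigma) / n


if __name__ == "__main__":
    for n in (10, 1000, 100_000):
        print(n, finite_n_rate(1.5, n), rate_function(1.5))
```

The difference between the two printed columns is $-\frac{1}{2n}\ln\frac{n}{2\pi\sigma^2}$, which decays like $\frac{\ln n}{n}$.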


As another simple instance, consider a set of $N$ spins $\sigma_1, \sigma_2, \ldots, \sigma_N$,
each having two possible values $\pm 1$. The number of spin configurations
(i.e., possible realizations of the spin values) with ‘mean magnetization’


$$m = \frac{1}{N} \sum_{i=1}^{N} \sigma_i \tag{5-90a}$$


is given by


$$\Omega(m) = \frac{N!}{\left[\frac{N(1-m)}{2}\right]!\,\left[\frac{N(1+m)}{2}\right]!} \qquad (-1 \le m \le 1). \tag{5-90b}$$


Assuming that all the spin configurations (‘microstates’ of the system) are
equally probable, the probability of the mean magnetization $m$ works out,
for large $N$, to


$$p(m) \approx e^{-N I(m)}, \qquad I(m) = \ln 2 + \frac{1+m}{2} \ln \frac{1+m}{2} + \frac{1-m}{2} \ln \frac{1-m}{2}. \tag{5-90c}$$


One again observes that $p(m)$ satisfies a large deviation principle with
rate function $I(m)$, which is convex in $m$, with a minimum at $m = 0$, where
$I(m) = 0$. Thus, for $N$ large, the most probable spin configuration ($m = 0$)
is overwhelmingly more probable than all the other possible
configurations. With $s(m) = \ln 2 - I(m)$ (i.e., $-I(m)$ up to an additive
constant) interpreted as the entropy per spin in units of $k_B$, one obtains
$\Omega(m = 0) = e^{N s(0)} = e^{S}$, the Boltzmann formula for the entropy ($S$) of
the equilibrium state $m = 0$ while, for other values of $m$
(‘non-equilibrium states’), eq. (5-90c) illustrates the Einstein
fluctuation formula for the system of spins.
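

The large-$N$ form (5-90c) can be compared with the exact count (5-90b). The sketch below evaluates $-\frac{1}{N}\ln\left[\Omega(m)/2^N\right]$ with the help of the log-gamma function and sets it against $I(m)$; the function names, and the use of the number of up spins `n_up` $= N(1+m)/2$ as the argument, are illustrative choices.

```python
import math


def exact_rate(n_spins: int, n_up: int) -> float:
    """-(1/N) ln[Omega(m) / 2**N], with Omega(m) taken from (5-90b).

    n_up = N(1+m)/2 is the number of spins with value +1.
    """
    log_omega = (math.lgamma(n_spins + 1) - math.lgamma(n_up + 1)
                 - math.lgamma(n_spins - n_up + 1))
    return math.log(2.0) - log_omega / n_spins


def rate_function(m: float) -> float:
    """I(m) of (5-90c), with the convention 0*ln(0) = 0 at m = +1 or -1."""
    total = math.log(2.0)
    for p in ((1.0 + m) / 2.0, (1.0 - m) / 2.0):
        if p > 0.0:
            total += p * math.log(p)
    return total


if __name__ == "__main__":
    m = 0.5
    for n_spins in (100, 10_000, 1_000_000):
        n_up = round(n_spins * (1.0 + m) / 2.0)
        print(n_spins, exact_rate(n_spins, n_up), rate_function(m))
```

The two columns approach one another as $N$ grows, the residual difference being of order $\frac{\ln N}{N}$.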


More generally, let $A_n$ be a random variable indexed by the integer $n$,
which takes up large values, and let $p(A_n \in B)$ be the probability that
$A_n$ takes a value in some specified set $B$. We say that $p(A_n \in B)$
satisfies a large deviation principle with rate $I_B$ if the limit


$$\lim_{n\to\infty} -\frac{1}{n} \ln p(A_n \in B) = I_B \tag{5-91a}$$


exists. It is in this sense that we express this as


$$p(A_n \in B) \approx e^{-n I_B}. \tag{5-91b}$$


While $A_n$ in the above formulae is a discrete random variable, the theory
often deals with continuous random variables as well, as in the case
of the sum-of-Gaussians $S_n$ (refer to eq. (5-88)). In order to treat the
two cases formally on the same footing we use a notation of the form
$p(S_n \in ds)$ ($= p(S_n \in [s, s + ds])$) so as to refer to the probability of
$S_n$ having a value between $s$ and $s + ds$. Such notation, however, is just
a matter of convenience.


Denoting the probability density describing the distribution of possible
values of $S_n$ by $p(s)$ (thereby deviating slightly from the notation in
(5-89a), (5-89b)), one has


$$p(S_n \in ds) = p(s)\,ds. \tag{5-92}$$


Adopting a notation of the form $p(S_n \in ds)$, one thus avoids referring to
probability densities.


In the case of a continuous random variable mentioned above the large
deviation principle takes the form


$$p(S_n \in ds) \approx e^{-n I(s)}\,ds, \tag{5-93a}$$


where the rate function $I(s)$ is obtained as


$$I(s) = \lim_{n\to\infty} -\frac{1}{n} \ln p(S_n \in ds), \tag{5-93b}$$


under the assumption that $ds$ stands for an infinitesimal but non-zero
differential quantity.


Most of the discrete random variables relevant in applications of the large 
deviation principle to statistical mechanics are ones that converge to con- 
tinuous variables in the weak sense, which is why one refers, by default, 


to continuous random variables (unless mentioned otherwise). 


In the literature on large deviation theory, one commonly uses a notation
of the form


$$p(S_n \in ds) \asymp e^{-n I(s)}\,ds, \tag{5-94a}$$


where the symbol ‘$\asymp$’ is used in the sense of a logarithmic equality in
the large-$n$ limit (and with the understanding that a differential such as
$ds$ is an infinitesimally small but non-zero quantity). Thus, in other
words,


$$a_n \asymp b_n \quad \text{if} \quad \lim_{n\to\infty} \frac{1}{n} \ln a_n = \lim_{n\to\infty} \frac{1}{n} \ln b_n. \tag{5-94b}$$


5.5.2 Large deviation principle and the rate function 


With these preliminaries in place, we state the Gärtner-Ellis theorem,
which is of fundamental relevance in the large deviation theory, as
follows:


The Gärtner-Ellis theorem. Given a real random variable $A_n$,
parametrized by the positive integer $n$, we define the scaled
cumulant generating function $\lambda(k)$, with real $k$, as


$$\lambda(k) = \lim_{n\to\infty} \frac{1}{n} \ln \langle e^{nkA_n} \rangle, \tag{5-95a}$$


where the symbol ‘$\langle \cdot \rangle$’ stands for the expectation value under
the probability distribution for $A_n$:


$$\langle e^{nkA_n} \rangle = \int p(A_n \in da)\, e^{nka}. \tag{5-95b}$$


If $\lambda(k)$ is differentiable for all real $k$, then $A_n$ satisfies a large
deviation principle, i.e.,


$$p(A_n \in da) \asymp e^{-n I(a)}\,da, \tag{5-95c}$$


with the rate function $I(a)$ given by


$$I(a) = \sup_{k}\,[ka - \lambda(k)]. \tag{5-95d}$$


Here ‘sup’ can, in most cases, be interpreted as the maximum
value. In formula (5-95d), $I(a)$ is referred to as the Legendre-Fenchel
transform (LF transform) of $\lambda(k)$. Thus, the Gärtner-Ellis theorem can
be briefly stated as saying that the rate function is the LF transform
of the cumulant generating function.


1. The Legendre transform constitutes a special instance of the LF
transform, when the latter is applied to differentiable convex functions,
in which case the inverse is also a Legendre transform.


2. The above outline of the Gärtner-Ellis theorem notwithstanding, the
rate function $I(a)$ need not always be determined in terms of a LF
transform of $\lambda(k)$, even though the latter exists.


One can see why the relation (5-95d) should give the rate function by
substituting for $p(A_n \in da)$ in (5-95b) from (5-95c) (where one assumes
that the rate function exists). This gives


$$\langle e^{nkA_n} \rangle \asymp \int e^{n(ka - I(a))}\,da. \tag{5-96a}$$


The integral on the right hand side can be evaluated for large $n$ by the
saddle-point approximation (or the Laplace approximation) which tells us
that the dominant behavior of the integral comes from the maximum of
$ka - I(a)$ considered as a function of $a$ (we assume that the maximum
exists and is unique), thereby implying


$$\langle e^{nkA_n} \rangle \asymp \exp\left[n \sup_{a}\,(ka - I(a))\right], \quad \text{i.e.,} \quad \lambda(k) = \sup_{a}\,[ka - I(a)]. \tag{5-96b}$$


If $\lambda(k)$ is assumed to be everywhere differentiable then the above LF
transform can be inverted, yielding (5-95d).
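

As a sketch of how (5-95d) operates in practice, the following Python fragment carries out the LF transform numerically for the Gaussian sample mean of (5-88), for which the scaled cumulant generating function is $\lambda(k) = \mu k + \frac{1}{2}\sigma^2 k^2$ (this follows from $\langle e^{kX} \rangle = e^{\mu k + \sigma^2 k^2/2}$ for a single Gaussian variable). The supremum is approximated by a brute-force search over a uniform grid in $k$; the grid limits, the step count, and the function names are assumptions made for the purpose of illustration, not a general-purpose algorithm.

```python
def scgf_gaussian(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Scaled cumulant generating function of the Gaussian sample mean."""
    return mu * k + 0.5 * sigma ** 2 * k ** 2


def legendre_fenchel(a: float, scgf, k_min: float = -20.0, k_max: float = 20.0,
                     steps: int = 40001) -> float:
    """Approximate sup_k [k*a - scgf(k)] by a search over a uniform grid in k.

    The result is only reliable when the maximizing k lies inside [k_min, k_max].
    """
    best = float("-inf")
    for i in range(steps):
        k = k_min + (k_max - k_min) * i / (steps - 1)
        best = max(best, k * a - scgf(k))
    return best


if __name__ == "__main__":
    for a in (0.0, 0.5, 1.5):
        # For mu = 0 and sigma = 1 the expected rate function is a**2 / 2, as in (5-89b).
        print(a, legendre_fenchel(a, scgf_gaussian), a ** 2 / 2.0)
```

The numerical transform reproduces the rate function $I(s) = \frac{(s-\mu)^2}{2\sigma^2}$ of (5-89b) to within the resolution of the grid.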


One thus discerns a natural connection between the large deviation prin- 
ciple and the saddle-point approximation, which appears in various dif- 
ferent contexts in statistical physics (refer, for instance, to sec. 5.4.2 


above). 


It turns out that $\lambda(k)$ is always convex and the rate function $I(a)$ is
always non-negative. Though the Gärtner-Ellis theorem requires $\lambda(k)$ to
be everywhere differentiable, this need not always be the case. If,
however, $\lambda(k)$ turns out to be everywhere differentiable then the LF
transform reduces to the Legendre transform, in which case, for any given
$a$, $k$ is obtained as the unique root of


$$\lambda'(k) = a, \tag{5-97a}$$


giving the point of stationarity relevant to the saddle-point
approximation. Denoting this by $k(a)$ one obtains the rate function as


$$I(a) = k(a)\,a - \lambda(k(a)). \tag{5-97b}$$
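

The prescription (5-97a), (5-97b) can be tried out on the spin example of (5-90a). For independent spins taking the values $\pm 1$ with equal probability one has $\langle e^{k\sigma} \rangle = \cosh k$, so that $\lambda(k) = \ln\cosh k$ and the root of $\lambda'(k) = \tanh k = a$ is $k(a) = \tanh^{-1} a$. The Python sketch below (function names are illustrative) checks that the resulting Legendre transform agrees with the rate function (5-90c) obtained by counting configurations.

```python
import math


def rate_from_legendre(a: float) -> float:
    """I(a) = k(a)*a - lambda(k(a)) with lambda(k) = ln cosh(k), for -1 < a < 1."""
    k = math.atanh(a)  # unique root of lambda'(k) = tanh(k) = a
    return k * a - math.log(math.cosh(k))


def rate_from_counting(a: float) -> float:
    """The rate function of (5-90c), valid here for -1 < a < 1."""
    p, q = (1.0 + a) / 2.0, (1.0 - a) / 2.0
    return math.log(2.0) + p * math.log(p) + q * math.log(q)


if __name__ == "__main__":
    for a in (-0.9, -0.3, 0.0, 0.5, 0.99):
        print(a, rate_from_legendre(a), rate_from_counting(a))
```

The two expressions coincide for all $-1 < a < 1$, as the Gärtner-Ellis theorem leads one to expect for this differentiable $\lambda(k)$.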


We state here an important generalization of the Gärtner-Ellis theorem:


If $f$ denotes a continuous function of $A_n$, and the functional $\lambda(f)$ is
defined as


$$\lambda(f) = \lim_{n\to\infty} \frac{1}{n} \ln \langle e^{n f(A_n)} \rangle, \tag{5-98a}$$


then the following relation holds,


$$\lambda(f) = \sup_{a}\,[f(a) - I(a)]. \tag{5-98b}$$


This formula, having a wide applicability, is referred to as Varadhan’s
theorem.


We end this brief survey of theoretical results relating to the large
deviation principle by stating the so-called contraction principle:


Let $B_n = h(A_n)$ be a random variable defined in terms of a continuous
function $h$ of $A_n$. In that case, if $A_n$ satisfies a large deviation
principle then so does $B_n$, with a rate function $I_B(b)$ related to the
rate function $I_A(a)$ of $A_n$ as


$$I_B(b) = \inf_{a:\,h(a)=b} I_A(a). \tag{5-99}$$


The relevance of the contraction principle stems from the fact that $h(a)$
may be a many-to-one function, in the context of which it selects out the
dominant probability from among several possible alternatives for a given
value of the fluctuation ($b$) of $B_n$.
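

A minimal illustration of (5-99), under assumptions chosen here for convenience: take $A_n$ to be the Gaussian sample mean with rate function $I_A(a) = \frac{(a-\mu)^2}{2\sigma^2}$ and let $h(a) = a^2$, a continuous two-to-one function. For $b > 0$ the set $\{a : h(a) = b\}$ consists of the two points $\pm\sqrt{b}$, and the infimum picks the one closer to $\mu$. The function names and the default $\mu = 1$ are hypothetical.

```python
import math


def rate_a(a: float, mu: float = 1.0, sigma: float = 1.0) -> float:
    """Rate function I_A(a) of the Gaussian sample mean."""
    return (a - mu) ** 2 / (2.0 * sigma ** 2)


def rate_b(b: float, mu: float = 1.0, sigma: float = 1.0) -> float:
    """I_B(b) for B_n = A_n**2: the infimum of I_A over the preimages of b."""
    if b < 0.0:
        return math.inf  # no real a satisfies a*a = b
    root = math.sqrt(b)
    return min(rate_a(root, mu, sigma), rate_a(-root, mu, sigma))


if __name__ == "__main__":
    for b in (0.0, 1.0, 4.0):
        print(b, rate_b(b))
```

With $\mu = 1$, the value $b = 4$ can arise from $a = 2$ or $a = -2$; the former has the smaller rate and therefore dominates the probability of the fluctuation, which is what the infimum in (5-99) expresses.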


5.5.3 Large deviation principle and equilibrium statistical mechanics


We now indicate briefly and sketchily, without entering into details, how 
the large deviation principle can be looked upon as a unifying concept 
in equilibrium statistical mechanics (refer once again to [138]). For this 
we first introduce a few terms of reference by way of recapitulation that 
may help in relating the large deviation principle to equilibrium statistical 


mechanics. 


5.5.3.1 Statistical mechanics: microstates and macrostates as random variables


We consider a system of $N$ particles, a microscopic state (‘microstate’)
of which is specified by a sequence $\omega = \{\omega_1, \omega_2, \ldots, \omega_N\}$, where
$\omega_i$ ($i = 1, 2, \ldots, N$) specifies the dynamical state of the $i$th particle
(in terms of its position and momentum vectors). We look at $\omega$ as an
$N$-component random variable, the set of all possible values of which
defines the phase space $\Gamma_N$, in which we define the prior probability
$p(d\omega)$ of the random variable $\omega$ as the uniform one corresponding to the
Liouville measure (based on the volume element in the phase space made up
of factors like $dq\,dp$ coming from canonically conjugate phase space
variables) compatible with the Hamiltonian equations of motion (refer to
sections 1.4, 9.2.2 for background).


The prior probability corresponding to the phase space volume
$d\omega = \frac{1}{N!\,h^{3N}}\, d^{(3N)}r\, d^{(3N)}p$ (refer back to sec. 1.3.5; observe the slight
change in notation) is given by $\frac{d\omega}{\int d\omega}$, where the integral in the
denominator is over the entire phase space; the latter, for large $N$, can
be written as $|\Lambda|^N$, where $|\Lambda|$ is a constant which, though divergent,
will not be relevant in the present context.


The probabilistic description of the system under consideration then refers 
to a set of external conditions or constraints on it, relevant to its macro- 
scopic behavior. The latter can be described by referring to a few macro- 


scopic or coarse-grained variables determining the possible ‘macrostates’. 


Refer back to chapter 1 for background. Also, move ahead to section 8.2.2
for more background on the thermodynamic description of equilibrium and
non-equilibrium states. A macrostate is described in terms of a relatively
small number of thermodynamic variables such as $X_0, X_1, X_2, \ldots$, as
outlined in sec. 8.2.2. In the literature, the term macrostate is commonly
used to denote an equilibrium state under a given set of constraints. More
generally, however, one refers to fluctuations around the equilibrium
state, characterized by values of the thermodynamic variables (which we
denote by $M_N$, see below) deviating from the equilibrium values; the
fluctuations may be looked upon as defining non-equilibrium macrostates.


From the mathematical point of view, a macrostate is specified in terms
of a function $M_N(\omega)$ of the microstates, where the sub-index $N$ refers to
the number of particles making up the system under consideration, and
where $M_N$ stands for a vector with a relatively small number of components
(corresponding to the thermodynamic variables in terms of which
the behavior of the system as a whole is specified). We look upon $M_N(\omega)$ as
a random variable, among whose possible values there exists one (or, in
special circumstances, a few) corresponding to the equilibrium configuration(s)
under the given constraints. Other possible values of $M_N(\omega)$ correspond
to non-equilibrium configurations, though not all non-equilibrium
configurations (such as the instantaneous state of a fluid in turbulent
motion) can be described in macroscopic terms.


As outlined in chapter 2, the constraints operating on the system determine
the statistical ensemble describing its equilibrium state, and the
equilibrium value of the (vector-valued) random variable $M_N(\omega)$. In reality,
the random variable $M_N$ is characterized by a probability distribution
(depending on the probability distribution of $\omega$) which, for large $N$, is
concentrated overwhelmingly on a single value in accordance with the large
deviation principle (as specified by the global minimum of the relevant
rate function), namely the one corresponding to the equilibrium state under
the given constraints. We will be interested in fluctuations in $M_N$ and
the corresponding rate function.


5.5.3.2 Large deviation principle for energy and free energy per particle


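The energy per particle is given by the random variable $h_N(\omega) = \frac{1}{N} H_N(\omega)$,
and the probability $p(h_N \in du)$ (i.e., the probability of the energy per
particle having a value between $u$ and $u + du$) is given by

$$p(h_N \in du) = \int_{h_N(\omega) \in du} p(d\omega) = \frac{1}{|\Lambda|^N} \int_{h_N(\omega) \in du} d\omega. \tag{5-100a}$$

At the same time, the Boltzmann formula for the entropy per particle
(corresponding to $h_N \in du$) gives

$$s(u) = \lim_{N\to\infty} \frac{1}{N} \ln \int_{h_N(\omega) \in du} d\omega, \tag{5-100b}$$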


where $s(u)$ stands for the entropy per particle, in units of $k_B$. One thereby
concludes that the probability $p(h_N \in du)$ conforms to a large deviation
principle, with the rate function $I(u)$ given by

$$I(u) = \ln|\Lambda| - s(u). \tag{5-100c}$$

In other words, assuming the rate function to exist (which is equivalent
to assuming the validity of the Boltzmann formula), it is seen to be the
entropy per particle (with a negative sign and in units of $k_B$) up to an
additive constant. Absorbing the additive constant in the definition of
entropy [138], one obtains, for the re-defined entropy,

$$s(u) = \lim_{N\to\infty} \frac{1}{N} \ln p(h_N \in du). \tag{5-101}$$


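It is now straightforward to see that the scaled cumulant generating function

$$\lambda(k) = \lim_{N\to\infty} \frac{1}{N} \ln \left\langle e^{N k h_N} \right\rangle \tag{5-102a}$$

is given by

$$\lambda(k) = -\phi(\beta)\big|_{\beta=-k} - \ln|\Lambda|, \tag{5-102b}$$

where

$$\phi(\beta) = \lim_{N\to\infty} -\frac{1}{N} \ln Z_N(\beta), \qquad Z_N(\beta) = \int e^{-\beta H_N(\omega)}\, d\omega, \tag{5-102c}$$

where $Z_N(\beta)$ stands for the partition sum at inverse temperature $\beta$ (recall
that we have set $k_B = 1$). On re-defining $\phi(\beta)$ by absorbing the constant
$\ln|\Lambda|$ into it (as in the case of the specific entropy above), one obtains

$$\phi(\beta) = \lim_{N\to\infty} -\frac{1}{N} \ln \left\langle e^{-N \beta h_N} \right\rangle, \tag{5-102d}$$

and $\phi(\beta)/\beta$ is identified as the free energy per particle $f(\beta)$.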


On making use of the Gärtner-Ellis theorem and Varadhan's theorem,
one then finds that $\phi(\beta)$ and $s(u)$ are LF transforms of each other,

$$\phi(\beta) = \inf_u\, [\beta u - s(u)], \qquad s(u) = \inf_\beta\, [\beta u - \phi(\beta)]. \tag{5-103}$$
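The pair of transforms in (5-103) can be checked numerically on a simple model. The sketch below assumes a system of independent two-level units with energies +1 and -1 per unit (an example chosen here for illustration, not one taken from the text), for which the entropy per unit is the binary entropy and the known closed form is $\phi(\beta) = -\ln(2\cosh\beta)$. The grids and the function names are illustrative choices, and the infima are only approximated on those grids.

```python
import math

def entropy(u):
    """Entropy per unit s(u), with k_B = 1, for independent two-level units
    of energy +1 or -1; u is the energy per unit, -1 <= u <= 1."""
    total = 0.0
    for x in ((1.0 + u) / 2.0, (1.0 - u) / 2.0):
        if x > 0.0:
            total -= x * math.log(x)
    return total

# Illustrative grids for the energy per unit u and the inverse temperature beta.
U_GRID = [k / 400.0 for k in range(-400, 401)]   # u in [-1, 1]
B_GRID = [k / 80.0 for k in range(-400, 401)]    # beta in [-5, 5]
S_GRID = [entropy(u) for u in U_GRID]

def phi(beta):
    """phi(beta) = inf_u [beta*u - s(u)], approximated on the u grid."""
    return min(beta * u - s for u, s in zip(U_GRID, S_GRID))

PHI_GRID = [phi(b) for b in B_GRID]

def entropy_from_phi(u):
    """s(u) = inf_beta [beta*u - phi(beta)], approximated on the beta grid."""
    return min(b * u - f for b, f in zip(B_GRID, PHI_GRID))

if __name__ == "__main__":
    beta, u = 0.7, 0.3
    print("phi:", phi(beta), "closed form:", -math.log(2.0 * math.cosh(beta)))
    print("s recovered:", entropy_from_phi(u), "s direct:", entropy(u))
```

Because this $s(u)$ is concave, the second transform returns the original entropy, up to the grid resolution.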


5.5.3.3 The microcanonical ensemble: Einstein's fluctuation relation


While the preceding considerations apply to an unconstrained system,
we now look at the constraint $H_N(\omega) = U$, i.e., $h_N = \frac{U}{N} = u$, for which
the equilibrium state is known to correspond to the microcanonical ensemble.
The latter corresponds to the conditional probability measure

$$p^{[u]}(d\omega) = p(d\omega \mid h_N \in du) = \frac{p(d\omega)}{p(h_N \in du)} \qquad (h_N(\omega) \in du), \tag{5-104a}$$


where the denominator is given by (5-100a). We now extend the microcanonical
measure $p^{[u]}(d\omega)$ to macrostates represented by the random
variable $M_N(\omega)$ by way of expressing the conditional probability measure
(for $h_N \in du$) relating to $M_N \in dm$ as

$$p^{[u]}(M_N \in dm) = \frac{p(h_N \in du,\; M_N \in dm)}{p(h_N \in du)}, \tag{5-104b}$$


where the numerator represents the joint probability for $h_N \in du$, $M_N \in dm$.
It now remains to obtain a large deviation principle for $p^{[u]}(M_N \in dm)$.
We assume that (i) a large deviation principle applies to the unconstrained
measure $p(M_N \in dm)$ with rate function, say, $-\tilde{s}(m)$ (where the negative
sign is in deference to common usage), and (ii) there exists a contraction
of $M_N$ to $h_N$, i.e., a function $h$ such that $h_N(\omega) = h(M_N(\omega))$.


For instance, if $M_N$ stands for the set of observables $\{X_0, X_1, \ldots, X_K\}$ (say;
$K \ll N$), then one may have $h_N = \frac{X_0}{N}$ (refer to section 8.2.2).


In this case the equilibrium value of $M_N$ that minimizes the rate function
of the constrained measure $p^{[u]}(M_N \in dm)$ is obtained, by making
use of the contraction principle, as the minimizer of the rate function
for the unconstrained measure $p(M_N \in dm)$, subject to the condition
$h(M_N(\omega)) = u$.


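More specifically, recalling the definition of the specific entropy function
$s(u)$ (eq. (5-101)), and the relation (by definition of $\tilde{s}(m)$)

$$p(M_N \in dm) \asymp e^{N \tilde{s}(m)}\, dm, \tag{5-105a}$$

one obtains

$$p^{[u]}(M_N \in dm) \asymp e^{-N I^{[u]}(m)}\, dm, \tag{5-105b}$$

where

$$I^{[u]}(m) = \begin{cases} s(u) - \tilde{s}(m) & \text{if } h(m) = u, \\ \infty & \text{otherwise.} \end{cases} \tag{5-105c}$$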


This implies that the equilibrium state corresponding to $M_N \in dm$ is the
one for which $s(u) = \tilde{s}(m)$ or, equivalently, the one that maximizes $\tilde{s}(m)$
subject to $h(m) = u$ (the generalized 'maximum entropy principle'; there
may be more than one equilibrium state, as in the case of a phase transition),
the maximum value being the microcanonical entropy $s(u)$.


Eq. (5-105b) (along with (5-105c)) provides a precise formulation of Ein- 
stein’s fluctuation relation while, at the same time, sharpening it in several 


respects [138]. 
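The constrained maximization behind (5-105c) can be illustrated on a small example. The sketch below assumes a macrostate $m$ made up of the occupation fractions of three single-particle levels, with $\tilde{s}(m)$ taken to be the Shannon entropy of $m$ and $h(m)$ the mean energy; the levels, the function names, and the grid size are hypothetical choices made here, not ones given in the text. It scans all $m$ compatible with $h(m) = u$ and compares the maximizer with the Boltzmann fractions having the same mean energy.

```python
import math

ENERGIES = (0.0, 1.0, 2.0)  # hypothetical single-particle levels

def macro_entropy(m):
    """Assumed form of s~(m): Shannon entropy (k_B = 1) of the fractions m."""
    return -sum(x * math.log(x) for x in m if x > 0.0)

def macro_energy(m):
    """Contraction h(m): energy per particle for occupation fractions m."""
    return sum(e * x for e, x in zip(ENERGIES, m))

def constrained_maximum(u, steps=3000):
    """Scan the one-parameter family of m with sum(m) = 1 and h(m) = u."""
    lo, hi = max(0.0, u - 1.0), u / 2.0
    best_s, best_m = -math.inf, None
    for k in range(steps + 1):
        m2 = lo + (hi - lo) * k / steps
        m = (1.0 - u + m2, u - 2.0 * m2, m2)
        if min(m) < 0.0:
            continue  # guard against rounding at the edges of the range
        s = macro_entropy(m)
        if s > best_s:
            best_s, best_m = s, m
    return best_s, best_m

def boltzmann(u):
    """Boltzmann fractions with mean energy u, found by bisection on beta."""
    def fractions(beta):
        w = [math.exp(-beta * e) for e in ENERGIES]
        z = sum(w)
        return [x / z for x in w]
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if macro_energy(fractions(mid)) > u:
            lo = mid  # the mean energy decreases as beta increases
        else:
            hi = mid
    return fractions(0.5 * (lo + hi))

if __name__ == "__main__":
    s_max, m_star = constrained_maximum(0.6)
    print("scan:", m_star, "entropy:", s_max)
    print("Boltzmann:", boltzmann(0.6))
```

In this setting the maximum value found by the scan plays the role of the microcanonical entropy $s(u)$, and the rate function $I^{[u]}(m)$ vanishes at the maximizer.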


5.5.3.4 The canonical ensemble: fluctuations around the equilibrium macroscopic state


The canonical ensemble is described by the constrained probability (corresponding
to the constraint of a fixed value of the temperature, or of the
inverse temperature $\beta$, instead of a fixed value of the specific energy $u$)

$$p_\beta(d\omega) = \frac{e^{-\beta H_N(\omega)}}{Z_N(\beta)}\, p(d\omega), \tag{5-106}$$

where $Z_N(\beta)$ stands for the $N$-particle partition function (second equality
in (5-102c)).


Analogous to the microcanonical case, one can formulate a large deviation
principle for the probability $p_\beta(M_N \in dm)$ of the macrostate $M_N$ having a
value between $m$ and $m + dm$ under the constraint of any given value of $\beta$,
obtaining

$$p_\beta(M_N \in dm) \asymp e^{-N I_\beta(m)}\, dm, \tag{5-107a}$$




where the rate function $I_\beta(m)$ is found to be [138]

$$I_\beta(m) = \beta h(m) - \tilde{s}(m) - \phi(\beta), \tag{5-107b}$$

where all the three terms on the right hand side have been introduced
before. As in the microcanonical case, the minimum of $I_\beta(m)$ identifies the
most probable or equilibrium macrostate $M_N$ (there may be more than
one global minimum, as in the case of a phase transition; refer to sections
5.4.2, 5.7). The most probable value $m$ of the macrostate $M_N$ also
corresponds to

$$\phi(\beta) = \inf_m\, [\beta h(m) - \tilde{s}(m)], \tag{5-108}$$

which expresses the 'minimum free energy principle' for the canonical
ensemble, where the expression within brackets on the right hand
side can be interpreted as the macrostate free energy.


Eq. (5-107a) can be interpreted as the Einstein fluctuation formula in the 


canonical ensemble. 
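The rate function (5-107b) and the possibility of more than one global minimum can be illustrated with a mean-field spin model. The sketch below assumes $h(m) = -m^2/2$ and takes $\tilde{s}(m)$ to be the binary entropy of the magnetization per spin $m$ (a Curie-Weiss type model, chosen here purely for illustration and not introduced at this point in the text). The names and the grid are hypothetical, and $\phi(\beta)$ is approximated by the infimum in (5-108) over the grid.

```python
import math

def macro_entropy(m):
    """Assumed s~(m): entropy per spin (k_B = 1) at magnetization per spin m."""
    total = 0.0
    for x in ((1.0 + m) / 2.0, (1.0 - m) / 2.0):
        if x > 0.0:
            total -= x * math.log(x)
    return total

def macro_energy(m):
    """Assumed h(m): energy per spin of the mean-field model."""
    return -0.5 * m * m

# Grid for m >= 0 only; the model is even in m, so -m behaves identically.
M_GRID = [k / 4000.0 for k in range(0, 4001)]

def macro_free_energy(beta, m):
    """Expression in brackets in (5-108): beta*h(m) - s~(m)."""
    return beta * macro_energy(m) - macro_entropy(m)

def phi(beta):
    """phi(beta) as the infimum in (5-108), approximated on the grid."""
    return min(macro_free_energy(beta, m) for m in M_GRID)

def rate_function(beta, m):
    """I_beta(m) of (5-107b)."""
    return macro_free_energy(beta, m) - phi(beta)

def equilibrium_magnetization(beta):
    """Non-negative minimizer of I_beta; -m is a minimizer too, by symmetry."""
    return min(M_GRID, key=lambda m: macro_free_energy(beta, m))

if __name__ == "__main__":
    for beta in (0.5, 1.5):
        m = equilibrium_magnetization(beta)
        print(beta, m, rate_function(beta, m), rate_function(beta, 0.0))
```

For small $\beta$ the only minimizer is $m = 0$, while for larger $\beta$ the sketch returns a non-zero $m$ satisfying $m = \tanh(\beta m)$ to grid accuracy, so that $+m$ and $-m$ are two global minima of $I_\beta$.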


5.5.3.5 Large deviation principle in statistical mechanics: concluding words


We conclude this brief and sketchy introduction to the large deviation
principle as one providing a unifying framework for equilibrium statistical
mechanics by way of mentioning (without entering into details) a few more
areas of foundational relevance where this unifying role appears to be
notable.

As mentioned earlier, the large deviation principle appears to be of great
value in explaining and interpreting non-equilibrium processes as well, this
being currently an active area of investigation. Apart from the number of
particles $N$, the large parameter here is the time interval ($t$) through which
a process evolves. In this book, however, we will not be concerned with this
important area of investigations on the large deviation principle.


We first refer to the problem of existence of the thermodynamic limit. 


Establishing a large deviation principle for a macrostate is equivalent to
proving the existence of a thermodynamic limit for an entropy function
in the microcanonical ensemble or a free energy function in the canonical
ensemble. However, in the preceding sections, we have either assumed
the existence of a large deviation principle in order to indicate its
consequences, or else have arrived at one by deriving its rate function by
contraction from some other rate function.


The general approach aimed at a direct proof of the existence of the ther- 
modynamic limit, developed by van Hove, Ruelle, Lanford, Griffiths, and 
others, has been outlined in sec. 5.3, where one needs to focus on the 
nature of the interaction between the particles making up a system (tem- 
peredness and stability, relating to the long range and short range be- 
havior of the interaction potential between particles; refer, in particular, 


to formula (5-25)). 


On the other hand, the problem of equivalence between ensembles can
be conveniently formulated in terms of the large deviation principle since
the equivalence proof rests on the application of the saddle-point
approximation (refer to sec. 5.4.2). However, the equivalence (between the
microcanonical and the canonical ensembles) is, at times, not of a simple
nature since the function $s(u)$ (sec. 5.5.3.2; recall that it is $-s(u)$ that
represents a rate function) is not necessarily concave. In the case of a
convex rate function (i.e., a concave $s(u)$), there is an equivalence between
the two ensembles at the thermodynamic level since the correspondence between
the entropy ($s(u)$) and the free energy ($\phi(\beta)$; recall, however, that what
is commonly referred to as the free energy in thermodynamic literature is
$\phi(\beta)/\beta$ rather than $\phi(\beta)$ itself) is then one-to-one and is given by a
Legendre-Fenchel (LF) transformation. In the case of a non-concave entropy
function, the two ensembles become non-equivalent since the LF transform then
does not work both ways.


More precisely, starting from a non-concave $s(u)$, one obtains $\phi(\beta)$ as
its LF transform, but the transform of $\phi(\beta)$ is not $s(u)$ but the concave
envelope of $s(u)$, of which the graph contains a linear part (representing
coexisting phases; refer to fig. 1-2). It is known that, in the case
of a non-convex rate function, the scaled cumulant generating function
becomes non-differentiable, as seen in the graph for $\phi(\beta)$ (at a value of
$\beta$ corresponding to the transition temperature). On the other hand, if
there is no first order phase transition, then the Legendre transform of
the microcanonical entropy is the canonical free energy.
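The one-way nature of the LF transform for a non-concave entropy can be seen in a small numerical experiment. The sketch below assumes a made-up entropy $s(u) = -(u^2 - 1)^2$ on $-1.5 \le u \le 1.5$, which has two maxima and is therefore not concave; neither this function nor the grids and names come from the text. It computes $\phi(\beta)$, transforms back, and exhibits both the linear part of the envelope and the kink in $\phi(\beta)$.

```python
def entropy(u):
    """Assumed non-concave entropy with two maxima, at u = -1 and u = +1."""
    return -(u * u - 1.0) ** 2

U_GRID = [k / 400.0 for k in range(-600, 601)]   # u in [-1.5, 1.5]
B_GRID = [k / 40.0 for k in range(-400, 401)]    # beta in [-10, 10]
S_GRID = [entropy(u) for u in U_GRID]

def phi(beta):
    """phi(beta) = inf_u [beta*u - s(u)], approximated on the u grid."""
    return min(beta * u - s for u, s in zip(U_GRID, S_GRID))

PHI_GRID = [phi(b) for b in B_GRID]

def envelope(u):
    """LF transform of phi: the concave envelope of s, on the beta grid."""
    return min(b * u - f for b, f in zip(B_GRID, PHI_GRID))

if __name__ == "__main__":
    step = 0.025
    left_slope = (phi(0.0) - phi(-step)) / step
    right_slope = (phi(step) - phi(0.0)) / step
    print("s(0) =", entropy(0.0), " envelope(0) =", envelope(0.0))
    print("s(1.2) =", entropy(1.2), " envelope(1.2) =", envelope(1.2))
    print("slopes of phi on either side of beta = 0:", left_slope, right_slope)
```

In the region where $s(u)$ is already concave and coincides with its envelope (for instance at $u = 1.2$) the double transform returns $s(u)$, whereas between the two maxima it returns the flat segment joining them; the two one-sided slopes of $\phi$ at $\beta = 0$ differ, which is the non-differentiability referred to above.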


One can summarize by saying that the mathematical language of statistical
mechanics is the large deviation theory. Theorems and their corollaries
in the large deviation theory lead to an abundance of insights and
concrete results in statistical mechanics ([138], [31], [105]).


5.6 Infinitely large systems: Gibbs states 


In this section we explain the idea of Gibbs states for infinitely extended
systems. Such states, representing equilibrium configurations of an
infinitely extended system, can be defined by several different but
equivalent approaches. We consider below (a) the approach based on
sequences of local distributions (sec. 5.6.1), (b) Gibbs states specified in
terms of the variational principle (sec. 5.6.3), (c) Gibbs states defined by
means of the DLR condition (sec. 5.6.4) and (d) Gibbs states defined in
terms of the KMS condition (sec. 5.6.5). In all these approaches, the problem
of existence of the thermodynamic limit is circumvented by specifying
the relevant equilibrium distributions.

Gibbs states for infinitely extended systems will be referred to in chapter 6
(sec. 6.2.5) in the context of the Ising model (in spatial dimension $D > 2$).


5.6.1 Gibbs distribution defined in terms of local distributions


Up until now we have considered systems of finite volumes and have 
then gone over to the limit of infinite volume. It is, however, desirable 
to consider an infinitely extended system from the very beginning and to 
define probability distributions for observables defined for such a system. 


An appropriate way of achieving this is to introduce the idea of locally
finite distributions.


Without going into technicalities, and accepting a certain degree of
mathematical looseness in our statements, we define, following [46], the
notional phase space ($\tilde{\Gamma}$; refer back to chapter 2, sec. 1.3.2.2) of an infinitely
extended system as the set of all possible joint sequences of momenta and
position co-ordinates $(r^{[\infty]}, p^{[\infty]})$, where $r^{[\infty]}$ stands for an infinite sequence
of the form $\{r_1, r_2, r_3, \ldots\}$ and, similarly, $p^{[\infty]}$ stands for $\{p_1, p_2, p_3, \ldots\}$, the
position vectors being such that only a finite number of these are located
within any arbitrarily specified finite volume. Any particular pair of
sequences $(r^{[\infty]}, p^{[\infty]})$ then corresponds to a locally finite configuration of the
system under consideration.


However, the actual phase space ($\Gamma$) differs from the notional phase space
for a system of identical particles, the former being obtained from the
latter by identifying any and every pair of points such as $(r^{[\infty]}, p^{[\infty]})$ and
$(r'^{[\infty]}, p'^{[\infty]})$ related to each other by identical permutations of the
momentum and position vectors.


A probability distribution in $\Gamma$ is defined in terms of an associated infinite
set of local probability distributions specified by functions of the form
$f_V^{[N]}(r_1, r_2, \ldots, r_N; p_1, p_2, \ldots, p_N)$, which we abbreviate as $f_V^{[N]}(r^{[N]}, p^{[N]})$, such
that the expression

$$\frac{1}{h^{3N} N!}\, f_V^{[N]}(r^{[N]}, p^{[N]})\, d^{[N]}r\, d^{[N]}p \tag{5-109}$$

gives the probability of $N$ particles being located in volume $V$ with position
and momentum vectors between $(r^{[N]}, p^{[N]})$ and $(r^{[N]} + d^{[N]}r,\; p^{[N]} + d^{[N]}p)$,
regardless of the states of all the other particles constituting the system.


The infinite set of local distributions, each corresponding to some given 
values of V, N, is then said to define a probability distribution (say, G), in 


the phase space of the infinitely extended system under consideration. 


Such an infinite set of local distributions (and hence a distribution $G$ in
$\Gamma$ pertaining to the infinitely extended system) is obtained by first
considering a grand canonical distribution corresponding to the probability
density

$$g(V', T, \mu;\; r^{[N]}, p^{[N]}) = \frac{1}{N!\, h^{3N} Z_g} \exp\left[-\beta\left(H^{[N]}(r^{[N]}, p^{[N]}) - \mu N\right)\right], \tag{5-110}$$

for a finite system, located within some arbitrarily specified large volume
$V'$ and characterized by temperature $T$ and chemical potential $\mu$, to have
$N$ particles located around the point $(r^{[N]}, p^{[N]})$ in the phase space $\Gamma^{[N]}$
pertaining to $N$ particles. In the above expression, $Z_g = e^{-\beta \Omega(V', T, \mu)}$
denotes the grand partition function corresponding to the grand canonical
potential $\Omega$ for such a system.


The potential energy function of the infinite system, which we denote
by $\Phi^{[\infty]}$, is defined in terms of $N$-particle functions $\Phi^{[N]}$ for
$N = 2, 3, \ldots$. It is the function $\Phi^{[N]}$ that is relevant in defining the
Hamiltonian $H^{[N]}$ appearing in the above expression. A special case
corresponds to the situation where all the $\Phi^{[N]}$'s are defined in terms
of a two-particle potential of the form $\phi(r_1, r_2)$.

Starting from $g(V', T, \mu)$ one can calculate the probability density for the
system to have $N$ particles located within some given region with volume
$V$ (where the region is contained within the one with the larger volume
$V'$) around the phase point $(r^{[N]}, p^{[N]})$, regardless of the states of other
particles (whose number may be of all possible values) that may be located
within $V'$ but outside $V$. We denote such a probability density by
$f_{V,V'}^{[N]}(T, \mu;\; r^{[N]}, p^{[N]})$ such that the expression

$$\frac{1}{h^{3N} N!}\, f_{V,V'}^{[N]}(T, \mu;\; r^{[N]}, p^{[N]})\, d^{[N]}r\, d^{[N]}p \tag{5-111}$$

gives the probability, for the finite system (characterized by parameters
$V'$, $T$, $\mu$) currently under consideration, of $N$ particles being located within
the range between $(r^{[N]}, p^{[N]})$ and $(r^{[N]} + d^{[N]}r,\; p^{[N]} + d^{[N]}p)$ in $\Gamma^{[N]}$ such that
each of the $N$ particles is located within the volume $V$.


We now consider the limit

$$f_V^{[N]}(T, \mu;\; r^{[N]}, p^{[N]}) = \lim_{V' \to \infty} f_{V,V'}^{[N]}(T, \mu;\; r^{[N]}, p^{[N]}), \tag{5-112}$$

which (provided it exists for each specified local volume $V$ and particle
number $N$) gives us a set of local distributions corresponding to various
possible values of $V$, $N$ (all characterized by the parameters $T$, $\mu$). This
infinite set, then, defines a probability distribution $G(T, \mu)$ for the infinitely
extended system (refer to (5-109) and the paragraphs following it)
characterized by an interaction potential defined in terms of the $N$-particle
potential $\Phi^{[N]}$ referred to above. Such a probability distribution, which
characterizes an equilibrium state of an infinitely extended system, is
referred to as a Gibbs distribution or, equivalently, as a Gibbs state.


Put differently, a Gibbs distribution for an infinitely extended system,
characterized by parameters $T$, $\mu$, is arrived at by considering a sequence
of grand canonical distributions with probability densities of the form (5-110),
with successively increasing values of the (global) volume $V'$, as $V' \to \infty$.
It turns out that the limit exists under a large class of boundary conditions
including, in particular, the one of an impenetrable boundary enclosing
the global volume, provided that the latter is made to go to infinity
in accordance with Fisher's condition and that the potential energy
function satisfies the condition of superstability (refer to formula (5-12a)).
Similar conclusions, leading to the existence of the Gibbs distribution,
hold in the case of the microcanonical and the canonical distributions
where, moreover, one can establish the equivalence of the various equilibrium
ensembles (i.e., the microcanonical, canonical and grand canonical
ones, along with other possible equilibrium ensembles), analogous to
results outlined in sec. 5.4.


Once a Gibbs distribution is specified, one can obtain physically relevant
information in the form of average values of arbitrarily specified local
observables from it. A local observable for any specified volume $V$ is
a function of the phase space variables (in $\Gamma$) that depends only on the
states of particles located within the finite volume $V$, regardless of the states
of particles outside this volume. If $F_V(r^{[N]}, p^{[N]})$ denotes the value of a
local observable for $N$ particles within $V$, located at specified points with
specified momenta, then the average value of the local observable will be
given by

$$\langle F_V \rangle = \sum_N \frac{1}{h^{3N} N!} \int d^{[N]}r\, d^{[N]}p\; F_V(r^{[N]}, p^{[N]})\, f_V^{[N]}(r^{[N]}, p^{[N]}), \tag{5-113}$$

where $f_V^{[N]}$ denotes a local distribution (corresponding to the volume $V$
and particle number $N$) defined by the Gibbs distribution in question,
and where the integration in each of the position co-ordinates in $r^{[N]}$ is
over the volume $V$ while the integration in each momentum vector in $p^{[N]}$
is over all possible momentum values with each component ranging from
$-\infty$ to $\infty$.


Incidentally, by referring to the way the local distribution functions are
obtained relative to the infinite volume limit of a grand canonical distribution
(refer to formulas (5-110), (5-112)), and noting that the classical
Hamiltonian $H^{[N]}(r^{[N]}, p^{[N]})$ depends quadratically on the momentum variables
$p_i$ ($i = 1, 2, \ldots, N$), one finds that their dependence on the momentum
variables can be factored out as

$$f_V^{[N]}(r^{[N]}, p^{[N]}) = \left(\frac{h^2}{2\pi m k_B T}\right)^{\frac{3N}{2}} \exp\left(-\sum_{i=1}^{N} \frac{p_i^2}{2 m k_B T}\right) \bar{f}_V^{[N]}(r^{[N]}), \tag{5-114}$$

where $\bar{f}_V^{[N]}$ represents the position dependent part of $f_V^{[N]}$ and the
pre-factor $\left(\frac{h^2}{2\pi m k_B T}\right)^{\frac{3N}{2}}$ is chosen so as to make $f_V^{[N]}$ dimensionless. Local
distribution functions with a momentum dependence of the above form are
said to be of the Maxwellian type ([46], chapter 5.5).


Thus, a Gibbs distribution (or a Gibbs state) is characterized in terms of
local distribution functions of the Maxwellian type or, in other words, in
terms of functions $\bar{f}_V^{[N]}(r^{[N]})$ related to the local distribution functions as
in (5-114).


5.6.2 Translation invariant Gibbs distributions 


A Gibbs distribution defined as in sec. 5.6.1 need not be translationally
invariant. A translation invariant Gibbs distribution is one such that any
and every local distribution function $f_V^{[N]}(r^{[N]}, p^{[N]})$ associated with it
satisfies, for any arbitrarily chosen vector $x$,

$$f_V^{[N]}(r_1, r_2, \ldots, r_N, p_1, p_2, \ldots, p_N) = f_{V_x}^{[N]}(r_1 + x, r_2 + x, \ldots, r_N + x, p_1, p_2, \ldots, p_N), \tag{5-115}$$

where $V_x$ is the volume obtained from $V$ by a translation through the
vector $x$.


A translation invariant (or, simply, 'invariant' in brief) Gibbs distribution
has the significance of describing a homogeneous phase of a system
(which may, however, be a statistical mixture of more than one pure
phase, see below) while a Gibbs distribution that is not translation
invariant may correspond to co-existing phases.


A statistical mixture of pure phases is not to be confused with physically
co-existing phases; the former describes a homogeneous system
that has a certain probability for being in each of two (or more) pure
phases while the latter corresponds to an inhomogeneous system
made up of two or more coexisting pure phases separated by phase
boundaries. Thus, one has the following possibilities: (a) a homogeneous
pure phase described by a translation invariant Gibbs distribution,
(b) a homogeneous phase corresponding to a statistical mixture
of pure phases described, once again, by an invariant Gibbs distribution,
and (c) an inhomogeneous admixture of coexisting phases,
described by a distribution that is not translation invariant.


5.6.3 Translation invariant Gibbs distributions: variational principle


Translation invariant Gibbs distributions, which form a subset of all pos- 


sible Gibbs distributions for any given system, can be characterized in 


terms of a variational principle. 


To start with, we recall that the classical grand potential satisfies a min- 
imum principle, as mentioned in sec. 2.2.5. This principle can be stated 


as follows. 


Consider a normalized probability distribution in the phase space for a
system of $N$ particles confined in a volume $V$ described by a distribution
function $f^{[N]}(r^{[N]}, p^{[N]})$ (in the notional phase space $\tilde{\Gamma}$ for the system) and,
for given values of $V$, $\beta\ (= \frac{1}{k_B T})$, the functional $\omega[\{f^{[N]}\}]$, defined for any
given set of such distribution functions $\{f^{[N]}\} = (f^{[0]}, f^{[1]}, \ldots)$, as

$$\omega[\{f^{[N]}\}] = \frac{1}{V} \sum_{N=0}^{\infty} \frac{1}{h^{3N} N!} \int d^{[N]}r\, d^{[N]}p\; f^{[N]}(r^{[N]}, p^{[N]}) \left(\frac{1}{\beta} \ln f^{[N]}(r^{[N]}, p^{[N]}) + H^{[N]} - \mu N\right), \tag{5-116}$$

where, for any specified value of $N$, the integration over the configuration
space in $\Gamma$ involves, for each of the $N$ particles, an integration over the
volume $V$.


One can now ask the question as to what the normalized distribution
functions $f^{[N]}$ should be such that $\omega[\{f^{[N]}\}]$ is stationary, i.e., given a set of
perturbed functions $\{f^{[N]} + \delta f^{[N]}\}$ satisfying the normalization conditions,
$\omega[\{f^{[N]} + \delta f^{[N]}\}] - \omega[\{f^{[N]}\}]$ depends on the variations $\{\delta f^{[N]}\}$ only through
terms of the second and higher degrees.


The minimum principle we are after then states that $\omega[\{f^{[N]}\}]$ is stationary
precisely when the distribution functions $f^{[N]}$ are given, for various values
of $N$, by the expression (2-93) (where the notation differs slightly; see also
formula (5-110)). The fact that this stationary value is a minimum can be
seen by evaluating the contribution to $\omega[\{f^{[N]} + \delta f^{[N]}\}] - \omega[\{f^{[N]}\}]$ by terms
of the second degree in the variations $\{\delta f^{[N]}\}$, when one finds that this
contribution is positive.


The minimum value attained by the functional $\omega[\{f^{[N]}\}]$ can be interpreted
as $\frac{\Omega}{V}$, i.e., the grand potential per unit volume for a system characterized
by parameters $V$, $T$, $\mu$.
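The minimum principle stated above has a simple finite analogue that can be tested directly. The sketch below replaces the phase space integrals by a sum over a handful of discrete states, each with an energy $E_i$ and a particle number $N_i$, and drops the factor $\frac{1}{V}$; this discretization, the sample values, and the function names are assumptions made here for illustration. It checks that the grand canonical weights give the value $-\beta^{-1} \ln Z_g$ and that randomly chosen normalized distributions never give a smaller value.

```python
import math
import random

def grand_functional(p, energies, numbers, beta, mu):
    """Discrete analogue of the functional, without the 1/V factor:
    sum_i p_i * (ln(p_i)/beta + E_i - mu*N_i)."""
    return sum(pi * (math.log(pi) / beta + e - mu * n)
               for pi, e, n in zip(p, energies, numbers) if pi > 0.0)

def grand_canonical(energies, numbers, beta, mu):
    """Grand canonical probabilities and the grand partition function Z_g."""
    w = [math.exp(-beta * (e - mu * n)) for e, n in zip(energies, numbers)]
    z = sum(w)
    return [x / z for x in w], z

def random_distribution(size, rng):
    """A random normalized trial distribution on `size` states."""
    w = [rng.random() + 1e-12 for _ in range(size)]
    total = sum(w)
    return [x / total for x in w]

if __name__ == "__main__":
    energies = [0.0, 0.4, 1.1, 0.9, 2.3]   # hypothetical state energies
    numbers = [0, 1, 1, 2, 2]              # hypothetical particle numbers
    beta, mu = 1.7, 0.3
    p_gc, z = grand_canonical(energies, numbers, beta, mu)
    omega_min = grand_functional(p_gc, energies, numbers, beta, mu)
    rng = random.Random(0)
    trials = [grand_functional(random_distribution(5, rng),
                               energies, numbers, beta, mu)
              for _ in range(1000)]
    print(omega_min, -math.log(z) / beta, min(trials))
```

In this discrete setting the excess of the functional over its minimum is $\beta^{-1}$ times the relative entropy of the trial distribution with respect to the grand canonical one, which is why no trial value falls below the minimum.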


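One can now generalize to an infinitely extended system so as to look
for Gibbs distributions that satisfy a similar variational principle. For
this we make use of the Maxwellian momentum dependence of the local
distribution functions $f_V^{[N]}$ associated with a Gibbs distribution $G(T, \mu)$ as
in (5-114), and introduce, following [46], the functionals $\rho[G]$, $s[G]$, $u[G]$ as

$$\rho[G] = \lim_{V \to \infty} \frac{1}{V} \sum_{N=0}^{\infty} \frac{N}{h^{3N} N!} \int f_V^{[N]}(r^{[N]}, p^{[N]})\, d^{[N]}r\, d^{[N]}p = \lim_{V \to \infty} \frac{1}{V} \sum_{N=0}^{\infty} \frac{N}{N!} \int \bar{f}_V^{[N]}(r^{[N]})\, d^{[N]}r, \tag{5-117a}$$

$$u[G] = \lim_{V \to \infty} \frac{1}{V} \sum_{N=0}^{\infty} \frac{1}{h^{3N} N!} \int f_V^{[N]}(r^{[N]}, p^{[N]})\, H^{[N]}(r^{[N]}, p^{[N]})\, d^{[N]}r\, d^{[N]}p = \lim_{V \to \infty} \frac{1}{V} \sum_{N=0}^{\infty} \frac{1}{N!} \int \left(\frac{3N}{2} k_B T + \Phi^{[N]}(r^{[N]})\right) \bar{f}_V^{[N]}(r^{[N]})\, d^{[N]}r, \tag{5-117b}$$

$$s[G] = -\lim_{V \to \infty} \frac{k_B}{V} \sum_{N=0}^{\infty} \frac{1}{h^{3N} N!} \int f_V^{[N]}(r^{[N]}, p^{[N]}) \ln f_V^{[N]}(r^{[N]}, p^{[N]})\, d^{[N]}r\, d^{[N]}p. \tag{5-117c}$$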


All the limits in the above formulas exist for hard core pair interactions
(as also for interactions of more general types) between the particles making
up the system. The above functionals acquire the interpretation of
representing the particle density, the specific internal energy, and the
specific entropy of the Gibbs distribution we finally arrive at, based on
the following central result:

Among all possible Gibbs distributions characterized by given values of
$T$, $\mu$, the translation invariant Gibbs distribution (let us call it $G(T, \mu)$;
there may be one or more such distributions, see below) corresponds to
the minimum value of $u[G] - T s[G] - \mu \rho[G]$, and this minimum value
represents the state function $-p$ ($p$ = pressure) for the state (of the infinite
system under consideration) represented by $G(T, \mu)$.

This result holds for systems with a hard core pair interaction as also for
more general superstable ones.


For given $T$, $\mu$, the minimum value in the above variational problem may
be realized either for a single translation invariant Gibbs distribution $G$
or for a set of distributions, the latter being the situation corresponding
to a phase transition. In the case of a phase transition, the minimum
is attained for a set of translation invariant Gibbs distributions of the
form $G(T, \mu; \alpha)$ ($0 < \alpha < 1$) where, in the case of a transition between two
phases, $G(T, \mu; \alpha)$ can be expressed in the form

$$G(T, \mu; \alpha) = \alpha\, G_1(T, \mu) + (1 - \alpha)\, G_2(T, \mu). \tag{5-118}$$

In this expression $G_1(T, \mu)$, $G_2(T, \mu)$ stand for two special translation
invariant Gibbs distributions that represent pure phases while $G(T, \mu; \alpha)$
represents a statistical mixture (refer back to sec. 5.6.2) of the two pure
phases with weights $\alpha$, $1 - \alpha$. All the states with various possible values
of $\alpha$ ($0 < \alpha < 1$) are characterized by a single value of the pressure $p$.


5.6.4 Gibbs states: the DLR distribution 


Gibbs states can be alternatively defined in terms of the so-called DLR
(Dobrushin-Lanford-Ruelle) distribution. A DLR distribution is defined in
terms of a set of local distributions $f_V$ for finite volumes $V$ that vary from
one local distribution to another, as in the case of local distributions
introduced above. However, a local distribution function associated with a
DLR distribution, giving the probability density for $N$ particles with
positions $r^{[N]}$ and momenta $p^{[N]}$ within the volume $V$, is conditional upon the
positions of all the other particles (infinite in number) outside the volume
$V$ being fixed at specified points, for all possible choices of these fixed
positions. This contrasts with the local distribution functions introduced
above where there is no reference to the states of particles located outside
the volume $V$ pertaining to a local distribution. A brief outline of the
DLR theory will be found in sec. 6.2.5.5 in the context of the infinitely
extended Ising lattice in dimension $D > 2$.


5.6.5 Gibbs states: the KMS condition 


The Kubo-Martin-Schwinger condition is of fundamental relevance in the
characterization of equilibrium states of infinitely extended quantum
mechanical systems. It is important to mention, however, that the classical
version of the KMS condition has also been formulated. Here we will do
no more than briefly state what the KMS condition stands for [108],
without going into the question of how the condition is employed in practice.


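Referring to the trace of a product of observables pertaining to a quantum
mechanical system with a finite dimensional Hilbert space, one can cyclically
re-order the factors occurring in the product without changing the trace,
though the operators themselves do not commute. Considering any two
observables $\hat{F}$, $\hat{G}$, the 'time correlation' $\langle \hat{F}(t)\hat{G} \rangle$ for an equilibrium state at
inverse temperature $\beta$ is given by ($Z = \mathrm{Tr}\, e^{-\beta \hat{H}}$, the partition function)

$$\langle \hat{F}(t)\hat{G} \rangle = \frac{1}{Z} \mathrm{Tr}\left(e^{-\beta \hat{H}} e^{\frac{i}{\hbar}\hat{H}t} \hat{F} e^{-\frac{i}{\hbar}\hat{H}t} \hat{G}\right) = \frac{1}{Z} \mathrm{Tr}\left(e^{\frac{i}{\hbar}\hat{H}(t + i\hbar\beta)} \hat{F} e^{-\frac{i}{\hbar}\hat{H}t} \hat{G}\right), \tag{5-119a}$$

and, likewise,

$$\langle \hat{G}\hat{F}(t) \rangle = \frac{1}{Z} \mathrm{Tr}\left(e^{-\beta \hat{H}} \hat{G} e^{\frac{i}{\hbar}\hat{H}t} \hat{F} e^{-\frac{i}{\hbar}\hat{H}t}\right) = \frac{1}{Z} \mathrm{Tr}\left(e^{\frac{i}{\hbar}\hat{H}t} \hat{F} e^{-\frac{i}{\hbar}\hat{H}(t - i\hbar\beta)} \hat{G}\right), \tag{5-119b}$$

where, in writing the second expression in eq. (5-119b), we have introduced a
couple of cyclic permutations in the ordering of the operators. The above
two formulas imply that there exists a function, say, $\omega_{FG}(z)$, of a complex
variable $z$ such that

$$\langle \hat{F}(t)\hat{G} \rangle = \omega_{FG}(t + i\hbar\beta), \qquad \langle \hat{G}\hat{F}(t) \rangle = \omega_{FG}(t). \tag{5-120}$$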


This pair of equations, which have been arrived at for a finite dimensional
system, constitute the KMS conditions we were after. Significantly, these
remain valid for an infinitely extended system as well, for which these
characterize its equilibrium states (or, Gibbs states) at inverse temperature
$\beta$. The KMS conditions can be summarized by stating that the
function $\omega_{FG}(z)$ is analytic in the strip $0 < \mathrm{Im}\, z < \hbar\beta$ such that the analytic
continuation from the real line ($\mathrm{Im}\, z = 0$), on which it takes the value
$\langle \hat{G}\hat{F}(t) \rangle$, to the line $\mathrm{Im}\, z = \hbar\beta$ is possible, where it takes the value $\langle \hat{F}(t)\hat{G} \rangle$.

An analogous condition can be formulated for a classical system as well
by making use of the correspondence between quantum mechanical commutators
and classical Poisson brackets,

$$\lim_{\hbar \to 0} \frac{1}{i\hbar} [\hat{F}(t), \hat{G}] \to \{F(t), G\} = \sum_i \left(\frac{\partial F(t)}{\partial q_i} \frac{\partial G}{\partial p_i} - \frac{\partial F(t)}{\partial p_i} \frac{\partial G}{\partial q_i}\right). \tag{5-121}$$

In this formula, $F(t)$ is obtained from $F(0) = F$ by means of a Heisenberg
type classical evolution (move ahead to sec. 8.4.2 for necessary background).


In order to formally derive the classical version of the KMS condition, we
write, from the second expressions in formulae (5-119a), (5-119b),

$$\langle [\hat{F}(t), \hat{G}] \rangle = \frac{1}{Z} \mathrm{Tr}\left([e^{-\beta \hat{H}}, \hat{F}(t)]\, \hat{G}\right). \tag{5-122}$$

We now take the classical limit (5-121) [48] and, in the resulting expression
on the right hand side, make use of the formula

$$\{e^{-\beta H}, F\} = -\beta e^{-\beta H} \{H, F\}, \tag{5-123a}$$

so as to obtain

$$\langle \{F(t), G\} \rangle = \beta \langle G\, \{F(t), H\} \rangle, \tag{5-123b}$$

where now the observables are represented by classical phase space functions
and the expectation values are with respect to the Gibbs distribution
defined by the KMS condition itself. As in the quantum case, the formal
derivation yields a valid formula for an infinitely extended system.
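The quantum relations (5-120) can be verified numerically for a small finite dimensional system. The sketch below works in the energy eigenbasis of a three-level system, takes $\hbar = 1$, and uses the explicit form $\omega_{FG}(z) = \frac{1}{Z} \mathrm{Tr}\left(e^{\frac{i}{\hbar}\hat{H}z} \hat{F} e^{-\frac{i}{\hbar}\hat{H}(z - i\hbar\beta)} \hat{G}\right)$, which is one function consistent with (5-119a) and (5-119b); the energies, the matrices, and the helper names are hypothetical choices made for this illustration.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def exp_h(energies, c):
    """exp(c*H) for a Hamiltonian that is diagonal with the given energies."""
    n = len(energies)
    return [[cmath.exp(c * energies[i]) if i == j else 0j for j in range(n)]
            for i in range(n)]

def heisenberg(op, energies, t):
    """F(t) = exp(iHt/hbar) F exp(-iHt/hbar)."""
    return matmul(matmul(exp_h(energies, 1j * t / HBAR), op),
                  exp_h(energies, -1j * t / HBAR))

def thermal_average(op, energies, beta):
    """<A> = Tr(exp(-beta*H) A) / Z."""
    rho = exp_h(energies, -beta)
    return trace(matmul(rho, op)) / trace(rho)

def omega_fg(z, f, g, energies, beta):
    """Assumed explicit form of omega_FG(z) for complex z."""
    left = exp_h(energies, 1j * z / HBAR)
    right = exp_h(energies, -1j * (z - 1j * HBAR * beta) / HBAR)
    partition = trace(exp_h(energies, -beta))
    return trace(matmul(matmul(matmul(left, f), right), g)) / partition

if __name__ == "__main__":
    energies = [0.0, 1.0, 2.5]
    f = [[1.0, 0.5 - 0.2j, 0.1j], [0.5 + 0.2j, -0.3, 0.7], [-0.1j, 0.7, 0.4]]
    g = [[0.2, 1.0j, 0.3], [-1.0j, 0.0, 0.6 + 0.1j], [0.3, 0.6 - 0.1j, -1.0]]
    t, beta = 0.7, 1.3
    ft = heisenberg(f, energies, t)
    print(omega_fg(t + 1j * HBAR * beta, f, g, energies, beta),
          thermal_average(matmul(ft, g), energies, beta))
    print(omega_fg(t, f, g, energies, beta),
          thermal_average(matmul(g, ft), energies, beta))
```

Each printed pair should agree to rounding error. Working in the energy eigenbasis keeps the matrix exponentials trivial, so the sketch needs nothing beyond the standard library.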




5.7 Thermodynamic limit: phase transitions 


5.7.1 Phase transitions: introduction 


Thermodynamic systems exhibit the phenomenon of phase transitions.
We consider here the phase transitions of the first kind in which the
thermodynamic potentials of a system suffer a loss of smoothness (of a
certain type) as a function of the relevant thermodynamic variables. For
instance, the derivative of the free energy ($F = -\beta^{-1} \ln Z_c$) with respect to
the temperature becomes discontinuous at the transition temperature,
while the pressure ($p = -\left(\frac{\partial F}{\partial V}\right)_T$) also develops a discontinuity of slope
as a function of the volume, with the $p$-$V$ graph acquiring a horizontal
part; the same phenomenon is also apparent in the form of a discontinuity
in the entropy ($S = -\left(\frac{\partial F}{\partial T}\right)_V$) as a function of temperature (refer to
sec. 1.2.6, and to fig 1-2).


The equilibrium ensembles of statistical mechanics lead to the thermo- 
dynamic variables only in the thermodynamic limit. In the case of a finite 
system, i.e., one away from the thermodynamic limit, one defines func- 
tions analogous to thermodynamic ones (in the sense that these satisfy 
analogous functional relations; refer to sec. 2.1.5, 2.1.6.2), but distinct 
from the latter in important respects, notable among these being the fea- 


ture of analyticity. 


The property of analyticity of a function $f(x)$ at a point $x_0$ requires that it possess a convergent power series expansion in some neighborhood of that point.
Thus, the canonical partition function for a finite system ($Z_c = \sum e^{-\beta E}$), being a sum over exponentials, is an analytic function for all finite values of $T$, and so is the ‘free energy’, defined as $F = -\beta^{-1}\ln Z_c$.


One of the major concerns of statistical mechanics is to explain how 
phase transitions of the first kind can become a possibility in the ther- 
modynamic limit. It is this issue that will engage our attention in the 


present section. 


5.7.2 Phase transitions: outline of Yang-Lee theory 


The basic question here is essentially a mathematical one, into which 
important insights were provided by the pioneering work of Yang and Lee 
(see [100], sections 3.3.4, 3.4, [67], sec. 9.3, [66], chapter 5, chapter 
7, and [117], section S7C), which we now look at in brief outline. For a 


pedagogical and readable account of the Lee-Yang theory, refer to [11]. 


Consider the grand canonical partition function for arbitrarily chosen values of $V (> 0)$, $T (> 0)$, $\mu$:

$$Z_g(V,T,\mu) = \sum_N z^N Z_c(V,T,N) \quad (z = e^{\beta\mu}),$$ (5-124a)


where the arguments in $Z_g$ and $Z_c$ are shown explicitly (in the following, we will suppress one or more of the arguments at places where there is little scope of confusion). For the sake of concreteness, we assume that the constituent particles in the system under consideration interact by means of a two-body potential that has a hard core, and an attractive tail of finite range, where the potential is bounded below. Yang and Lee considered a classical system of particles, in which case one can, on performing the momentum integrations in each term on the right hand side of (5-124a), express the grand canonical partition function as

$$Z_g(V,T,y) = \sum_N \frac{y^N}{N!}\,C_N(V,T),$$ (5-124b)


where $C_N(V,T)$ is the configuration integral of (4-7), and the parameter $y$ is defined as

$$y = z\left(\frac{2\pi m k_B T}{h^2}\right)^{3/2}.$$ (5-124c)


Note that analyticity in $T$, analyticity in $z$ (for a given $\mu$), and analyticity in $y$ are equivalent.


For a given finite value of $V$, there exists a finite number, say $M(V)$, of particles that correspond to dense packing, in consequence of which the potential energy becomes infinitely large for $N > M(V)$ (owing to the hard core of interaction), and hence $Z_c(V,T,N) = 0$ whenever $N > M(V)$. This means that $Z_g$ is a polynomial in $y$ of degree at most $M(V)$ with positive coefficients ($C_N$ is positive by definition), and hence cannot have a zero for real positive values of $y$, as shown in fig. 5-11. This, in turn, means that the grand potential $\Omega\,(= -k_B T\ln Z_g)$ cannot exhibit a loss of analyticity for any real positive $y$ (i.e., for any real positive value of the temperature $T$).
Figure 5-11: Illustrating the disposition of zeros of the grand partition function $Z_g$ in the complex $y$-plane for a system characterized by a hard core potential; for any finite volume $V$, the zeros stay away from the positive real axis; as $V$ is made to tend to $\infty$, the zeros may close in at a point on the positive real axis as in fig. 5-13(A).

Denoting the zeros of $Z_g$ in the complex $y$-plane by $y_k(V,T)$ ($k = 1, 2, \cdots, M$), one can express it in the form

$$Z_g(V,T,y) = \prod_{k=1}^{M}\left(1 - \frac{y}{y_k}\right)$$ (5-125)


(reason this out; refer, in particular, to the term independent of $y$ in the polynomial in (5-124b) which, under the assumed conditions, is of degree $M(V)$). As $V$ is made to tend to $\infty$, there occur changes in the number and locations of the zeros $y_k$ ($k = 1, 2, \cdots, M(V)$) in the complex $y$-plane, none of which is located on the positive real axis, while complex roots occur in conjugate pairs. This implies the possibility that, in the limit $V \to \infty$, a set of complex roots, densely located on some curve, pinch a point $y_0$ (say) on the real axis, i.e., lie arbitrarily close to $y_0$ on the curve. There then appear two segments on either side of $y_0$ corresponding to distinct thermodynamic phases as outlined below.
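The statements about the zeros at finite volume can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: it uses a one dimensional gas of hard rods (a pure hard core with no attractive tail, not the potential considered in the text), for which the configuration integral is taken to be $C_N = (L - Na)^N$ for $Na < L$ and zero otherwise. The function names and the Durand-Kerner root finder are choices made here for illustration and do not come from the text.

```python
import math


def hard_rod_coefficients(length, rod):
    """Coefficients c_N = C_N / N! of Z_g(y) = sum_N c_N * y**N for hard rods on a line.

    Assumption: C_N = (length - N*rod)**N while N*rod < length, and 0 beyond,
    so that the polynomial has degree M(V) = len(coeffs) - 1.
    """
    coeffs = []
    n = 0
    while n * rod < length:
        coeffs.append((length - n * rod) ** n / math.factorial(n))
        n += 1
    return coeffs


def polynomial_roots(coeffs, iterations=500):
    """All complex roots of sum_k coeffs[k] * y**k by the Durand-Kerner iteration."""
    degree = len(coeffs) - 1
    monic = [c / coeffs[-1] for c in coeffs]  # leading coefficient scaled to 1
    roots = [(0.4 + 0.9j) ** k for k in range(degree)]  # distinct starting points
    for _ in range(iterations):
        for i in range(degree):
            value = sum(m * roots[i] ** k for k, m in enumerate(monic))
            denom = 1.0 + 0.0j
            for j in range(degree):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots


def product_form(roots, y):
    """Right hand side of the product representation: prod_k (1 - y / y_k)."""
    result = 1.0 + 0.0j
    for root in roots:
        result *= 1.0 - y / root
    return result


if __name__ == "__main__":
    coeffs = hard_rod_coefficients(length=4.5, rod=1.0)
    for root in polynomial_roots(coeffs):
        print(root)
```

Since every coefficient is positive, the polynomial is positive for all $y > 0$, so none of the computed roots can lie on the positive real axis; the product over the roots reproduces the polynomial because its constant term is $C_0 = 1$.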


The equation of state is obtained from the formulas for pressure and volume per particle as

$$\frac{p}{k_B T} = \lim_{V\to\infty}\frac{1}{V}\ln Z_g,$$ (5-126a)

$$\frac{1}{v} = \lim_{V\to\infty} y\frac{\partial}{\partial y}\left(\frac{1}{V}\ln Z_g\right).$$ (5-126b)

In general, the two operations of going to the limit $V \to \infty$ and applying the differential operator $y\frac{\partial}{\partial y}$ do not commute. However, for a system of particles interacting through a two-body potential of the type indicated above, Yang and Lee were able to prove the following ([144]): (a) for all real positive values of $y$, the limit on the right hand side of (5-126a) exists, is independent of the shape of the region whose volume $V$ is made to tend to $\infty$, and is a continuous and monotonically increasing function of $y$; (b) if, in a region $R$ (in the complex $y$-plane) containing a segment of the real axis, there are no roots of $Z_g = 0$ (such a region is shown in fig. 5-12(A)), then the limits in (5-126a), (5-126b) exist and are analytic in $y$. In particular, the operations $\lim_{V\to\infty}$ and $y\frac{\partial}{\partial y}$ commute in $R$, i.e.,

$$\frac{1}{v} = y\frac{\partial}{\partial y}\left(\frac{p}{k_B T}\right).$$ (5-126c)


It follows that the pressure, along with all its derivatives with respect to $y$, is continuous on the segment of the real $y$-axis intersecting $R$, and the volume is a decreasing function of pressure at constant temperature, as in fig. 5-12(B) (refer back to (4-86) and the paragraph following it).


Figure 5-12: (A) Disposition of zeros of $Z_g$ in the complex $y$-plane in the thermodynamic limit for a single phase (schematic): there are no zeros in a region $R$ that intersects the real axis in a segment, which corresponds to a single phase over the corresponding range of temperatures; (B) the pressure is a smooth function of $y$ on the segment of the real axis intersecting $R$, and is a continuous decreasing function of the specific volume $v$.

If, on the other hand, the roots of $Z_g$ pinch the real $y$-axis at a point $y_0$ as $V \to \infty$ (see fig. 5-13(A)), then the above conclusions cease to hold. In particular, $\frac{\partial p}{\partial y}$ may become discontinuous (fig. 5-13(B)), when $v$ acquires a jump discontinuity at $y_0$ and the p-V isotherm acquires a flat segment (fig. 5-13(C)), indicative of a phase transition of the first kind, with two distinct phases appearing on the two sides of $y = y_0$ (corresponding to a transition temperature, say, $T = T_0$ at a constant value of $\mu$). Alternatively, $\frac{\partial p}{\partial y}$ may be continuous at $y_0$ but $\frac{\partial^2 p}{\partial y^2}$ (or a higher derivative) may be discontinuous, corresponding to a phase transition of the second order (or of a higher order).


If the pinching occurs at two points on the real $y$-line (as in fig. 5-14) then one can have two successive phase transitions (for instance, a gas-liquid transition followed by a liquid-solid one as $y$ is lowered), which degenerate into a triple point if the points of pinching approach each other and then coalesce.

Figure 5-13: (A) Pinching; the zeros of $Z_g$, some of which lie on the dotted line, pinch the real axis in the $y$-plane at a point $y_0$; there appear two regions $R_1$, $R_2$, whose intersections on the positive real axis in two segments correspond to distinct phases, with a phase transition occurring at $y_0$; the pressure ceases to be a smooth function of $y$ at $y_0$; (B) the pressure is continuous but the derivative $\frac{\partial p}{\partial y}$ is discontinuous at $y = y_0$, corresponding to a phase transition of the first order where the p-v graph acquires a flat portion as in (C); if $\frac{\partial p}{\partial y}$ is continuous but $\frac{\partial^2 p}{\partial y^2}$ is not, then the phase transition is one of the second order.


Yang and Lee [145] applied the above considerations to the particular case of a lattice gas, i.e., to a system of particles placed at sites on an infinitely extended discrete set of points with a two-body interaction potential that is attractive in nature between particles located at distinct sites and is, additionally, a hard core one in that no two particles could occupy the same site. They did not make any additional assumption regarding the nature of the lattice, which might as well be an aperiodic one. This simple model also described a magnetic lattice, with a certain rule of correspondence between the gas model and the magnetic model.

Figure 5-14: Pinching at two points $y = y_1$, $y = y_2$ on the positive real axis in the complex $y$-plane; there appear three regions $R_1$, $R_2$, $R_3$ in which $p$ and $v$ possess distinct analyticity properties, and the segments of the real axis intersecting these three regions correspond to distinct phases; in other words, there occur two successive phase transitions as $y$ (or the temperature) is made to change through the points of pinching; in the special case in which the points $y_1$, $y_2$ coalesce, there corresponds a triple point.


They showed that, in the limit of the number of particles in the lattice going to infinity, the zeros of the grand partition function were distributed densely on the unit circle in the complex $y$-plane (in the case of the lattice gas, the variable $y$ stands for the fugacity $z$) with its center at the origin, such that the pinching occurs at the point $y = 1$ on the positive real axis, and the model admits of a single phase transition of the first order at $y = 1$. The distribution of zeros on the unit circle could be described in terms of a density function $g(\theta)$ ($0 < \theta < 2\pi$) such that the pressure ($p$) and the specific volume ($v$) are given by expressions of the form

$$\frac{p}{k_B T} = \int_0^{\pi} d\theta\, g(\theta)\ln\left(1 - 2y\cos\theta + y^2\right), \qquad \frac{1}{v} = 2y\int_0^{\pi} d\theta\, g(\theta)\,\frac{y - \cos\theta}{1 - 2y\cos\theta + y^2}.$$ (5-127)


It can be seen from these expressions that the necessary singularity can occur only at $y = 1$, and that too if $g(0) \neq 0$.
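The logarithmic integrand in (5-127) can be traced back to the product form (5-125). The following is only a sketch, assuming that the zeros lie on the unit circle in conjugate pairs $e^{\pm i\theta_k}$ ($0 < \theta_k < \pi$) and that, as $V \to \infty$, the number of such pairs with angle between $\theta$ and $\theta + d\theta$ is $V g(\theta)\,d\theta$ (a normalization adopted here for illustration).

```latex
% A conjugate pair of zeros contributes a real quadratic factor to (5-125)
\left(1 - y e^{-i\theta_k}\right)\left(1 - y e^{i\theta_k}\right) = 1 - 2y\cos\theta_k + y^2 .

% Summing the logarithms over pairs and passing to the density of zeros
\frac{1}{V}\ln Z_g = \frac{1}{V}\sum_k \ln\left(1 - 2y\cos\theta_k + y^2\right)
  \;\longrightarrow\; \int_0^{\pi} d\theta\, g(\theta)\,\ln\left(1 - 2y\cos\theta + y^2\right).

% Applying y d/dy to the integrand gives the integrand for 1/v
y\frac{\partial}{\partial y}\ln\left(1 - 2y\cos\theta + y^2\right)
  = \frac{2y\,(y - \cos\theta)}{1 - 2y\cos\theta + y^2}.
```

Since $1 - 2y\cos\theta + y^2 = (y - \cos\theta)^2 + \sin^2\theta$, the argument of the logarithm vanishes for real positive $y$ only at $y = 1$, $\theta = 0$, which is why a singularity requires zeros to accumulate at $\theta = 0$.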


The lattice model possesses a nice electrostatic analogy where one consid- 
ers a distribution of uniform line charges perpendicular to a plane (which 
we term the y-plane), the points of intersection of the line distributions 
with the plane being all located on the unit circle. The pressure and the specific volume for any specified value of the fugacity $y$ in the lattice model then correspond, in the electrostatic analogy, to the electrostatic potential and the field strength at the corresponding point in the $y$-plane.
The potential and the field strength develop a singularity (i.e., a lack of 
smoothness) at y = 1 as the density of the line charges on the unit circle 


goes to infinity. 
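The location of the zeros on the unit circle can be verified by brute force for a very small magnetic system. The sketch below is an illustration under stated assumptions, not the construction used by Yang and Lee: it takes a ring of a few Ising spins with a ferromagnetic nearest-neighbour coupling $K = \beta J > 0$ and uses the variable $z = e^{-2\beta h}$ ($h$ being the external field), in which the partition function is, apart from a prefactor, a polynomial $P(z)$ whose coefficients are indexed by the number of down spins. All function names are hypothetical. Rather than computing the roots, the code counts sign changes of a real function on the circle, which is legitimate because the coefficients are symmetric.

```python
import itertools
import math


def ring_coefficients(n_spins, coupling):
    """a[n] = sum over ring configurations with n down spins of exp(K * sum_i s_i * s_(i+1))."""
    coeffs = [0.0] * (n_spins + 1)
    for spins in itertools.product((1, -1), repeat=n_spins):
        bonds = sum(spins[i] * spins[(i + 1) % n_spins] for i in range(n_spins))
        coeffs[spins.count(-1)] += math.exp(coupling * bonds)
    return coeffs


def zeros_on_unit_circle(coeffs, samples=4000):
    """Count the zeros of P(z) = sum_n a[n] * z**n lying on |z| = 1.

    Valid when a[n] == a[N - n]: then exp(-i*N*theta/2) * P(exp(i*theta)) equals
    the real function f(theta) = sum_n a[n] * cos((n - N/2) * theta), and every
    simple zero of P on the circle shows up as a sign change of f in (0, 2*pi).
    """
    degree = len(coeffs) - 1

    def f(theta):
        return sum(a * math.cos((n - degree / 2) * theta) for n, a in enumerate(coeffs))

    values = [f(2 * math.pi * (k + 0.5) / samples) for k in range(samples)]
    return sum(1 for k in range(samples - 1) if values[k] * values[k + 1] < 0)


if __name__ == "__main__":
    coeffs = ring_coefficients(n_spins=6, coupling=0.5)
    print(zeros_on_unit_circle(coeffs), "zeros found on the unit circle, degree", len(coeffs) - 1)
```

When the number of sign changes equals the degree of the polynomial, all the zeros are accounted for on the unit circle, so none can lie on the positive real axis at finite size.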


5.7.3 Phase transition of the first kind: summary and overview


The thermodynamics of phase transitions has been explained with incisive 
clarity in [15] (chapter 9), where the role of fluctuations in the actual occur- 
rence of a phase transition has also been pointed out ([15], chapter 19) with 


characteristic precision and lucidity. 
In a phase transition of the first kind, one has two competing phases, each described by a thermodynamic fundamental relation ($S$ as a function of $U, V, N$ in the case of a simple fluid), from which one can work out other
thermodynamic functions. Of particular relevance in the case of a fluid 
is the Gibbs potential G(p,T, N) (since phase transitions are commonly 
observed under conditions of constant pressure and temperature), while 
other thermodynamic potentials, such as the free energy or the grand 
potential, can also be invoked to describe the transition between the two 


phases. 


Just as the free energy and the grand potential are thermodynamic poten- 
tials relating to the canonical and the grand canonical ensembles, the Gibbs 
potential is related to the pressure-temperature ensemble, in which the vol- 
ume V of the members of the ensemble as also their energy are fluctuat- 
ing variables. Such an ensemble represents a system in interaction with 
a reservoir characterized by some particular pressure (p) and temperature 
(T), both of which are assumed to remain unchanged in the interaction. 
The Gibbs potential ($G(p,T,N)$) is related to the free energy ($F(V,T,N)$) by a Legendre transformation

$$G = F + pV, \qquad \left(\frac{\partial G}{\partial p}\right)_{T,N} = V.$$ (5-128)


Considering either of the two competing phases, the equilibrium state (call it S) for given values of $p, T, N$ corresponds to a minimum value of $G$ as compared with nearby states characterized by fluctuations of various magnitudes of $V$, $U$, where each such state can be looked upon as one of constrained equilibrium that goes over to the unconstrained equilibrium state S when the constraints are removed. Generally speaking, for given values of $p, T$ (we suppress the reference to $N$ for the sake of brevity) one obtains two distinct values, say, $G_1(p,T)$, $G_2(p,T)$ of the Gibbs potential, each corresponding to one of the two competing phases, of which one is lower than the other, thereby determining which of the two phases is the thermodynamically relevant one for those given values of $p, T$. Starting from a situation in which $G_1$ is lower than $G_2$, if either of the two relevant variables (say, $T$) is made to vary (with $p$ assumed to be held fixed for the sake of concreteness) in such a way that the difference $G_2 - G_1$ gets diminished, there eventually arises a situation when the two values of $G$, evaluated for the two phases, become equal. This marks the condition for the coexistence of the two phases, and as $T$ is made to vary further in the same direction, $G_2$ now becomes lower than $G_1$, indicating that the system has now switched over to the other thermodynamically relevant phase in a first order phase transition. The transition point is characterized by a discontinuity of the derivative $\left(\frac{\partial G}{\partial T}\right)_p$, corresponding to a discontinuity in the specific entropy of the two phases (a second order phase transition occurs when the value of the pressure $p$, which was assumed above to be held fixed, is chosen in such a way that the second derivative $\left(\frac{\partial^2 G}{\partial T^2}\right)_p$ becomes discontinuous while the first derivative $\left(\frac{\partial G}{\partial T}\right)_p$ remains continuous).
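The crossing of the two branches of the Gibbs potential can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that at the fixed pressure each phase has a Gibbs potential linear in temperature, $G_i(T) = H_i - T S_i$, with constant enthalpy $H_i$ and entropy $S_i$; the numerical values and function names are invented and carry no physical significance.

```python
def gibbs_branch(enthalpy, entropy):
    """Return G(T) = H - T*S for one phase, with H and S treated as constants (toy model)."""
    return lambda temperature: enthalpy - temperature * entropy


def coexistence_temperature(h1, s1, h2, s2):
    """Temperature at which the two linear branches cross: H1 - T*S1 = H2 - T*S2."""
    return (h2 - h1) / (s2 - s1)


def equilibrium_gibbs(temperature, branches):
    """Equilibrium value of G: the lowest of the competing branches."""
    return min(branch(temperature) for branch in branches)


if __name__ == "__main__":
    h1, s1 = 0.0, 1.0   # phase 1: lower enthalpy, lower entropy (hypothetical values)
    h2, s2 = 6.0, 3.0   # phase 2: higher enthalpy, higher entropy (hypothetical values)
    g1, g2 = gibbs_branch(h1, s1), gibbs_branch(h2, s2)
    t0 = coexistence_temperature(h1, s1, h2, s2)
    print("coexistence temperature:", t0)
    print("entropy jump:", s2 - s1, "latent heat:", t0 * (s2 - s1))
    for temperature in (2.0, 4.0):
        print(temperature, equilibrium_gibbs(temperature, [g1, g2]))
```

Below the crossing temperature the first branch is the lower one, above it the second; the equilibrium $G$ is continuous at the crossing while its slope, $-S$, jumps, which is the signature of a first order transition described in the paragraph above.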


One can equally well describe the entire scenario of a first order phase transition in terms of the free energy $F$ (refer to section 1.2.6 and figure 1-2, and also to sec. 5.4.2 and fig. 5-10), or the grand potential $\Omega$ (refer to sec. 5.7.2), because all these descriptions become equivalent in the thermodynamic limit.
As seen in sec. 5.7.2, a phase transition of the first kind is marked, from the mathematical point of view, by the phenomenon of pinching where the zeros of the grand partition function considered in terms of the fugacity $z$ (or the variable $y$ defined in (5-124c)) close in on some point on the positive real axis of the complex $y$-plane, being located densely along some line intersecting the positive real axis. In the case of a lattice gas (or an equivalent magnetic system), the relevant zeros lie on the unit circle.


The van der Waals equation is of considerable relevance in the physical understanding of phase transitions, since it was the first theoretical model of phase transition and of critical phenomena (refer to fig. 4-6
where the critical point is indicated) with a correct orientation that can 
be interpreted in physical terms. As we saw in section 4.1.3.2, the van 
der Waals equation is obtained from the virial series for a dilute gas by 
truncating the series at the second virial coefficient for the square well po- 
tential. The latter implies a short-range attraction among the molecules, 
and one concludes that the interplay between the attractive and repul- 
sive forces (the latter arising when molecules tend to overlap with one 


another) is responsible for the phase transition. 


A phase transition is also possible under the action of the repulsive 
forces alone, as in the case of the Kirkwood transition observed in 
numerical simulations of the dynamics of a hard sphere fluid (refer 
to section 4.4); the van der Waals equation is also obtained from a 
perturbative treatment starting from a hard sphere fluid, as indicated 


in section 4.5.7. 
The range of the attractive interaction has subtle implications regarding the possibility of a phase transition. While the square well model
refers to an attractive part of the interaction that is cut off at some fi- 
nite distance between the molecules (more generally, one considers an 
exponentially falling attractive potential), the mean field theory implicitly 
requires an attractive interaction that is of sufficiently long range so that 
a large number of molecules may, at any instant, lie within the sphere of 
action of any chosen molecule, as a result of which the potential felt by 
the chosen molecule may be represented by the average effect (the ‘mean 
field’) of all these other molecules. In particular, an algebraically falling 
attractive potential may not be inconsistent with the possibility of a phase 
transition, though exponentially falling potentials (or ones cut off after a 
finite distance) are commonly assumed in concrete models that examine 


the problem of phase transitions in systems. 


One dimensional systems. 


There exists an extensive literature on the phase transition problem in 1D 
and 2D models where a number of rigorous results have been obtained, 
most of those relating to molecules (or interacting units) residing at dis- 
cretely distributed points on a lattice — specifically, to magnetic systems. 
The dimensionality (D = 1,2,---) is important in that it affects the possi- 
bility of a phase transition in a marked way. In particular, there exists a 
strong constraint against the occurrence of phase transitions in 1D sys- 
tems with short range attractive interactions. As shown by Van Hove in 
an early study (see [24] for an illuminating review of the 1D phase transition problem), phase transitions are ruled out in a homogeneous 1D continuously distributed system with a short range attractive interaction (one with a finite cut-off) and a hard core repulsion (this effectively corresponds to a square well potential), in the absence of an external field.


A homogeneous system is one made up of identical interacting units. This 
excludes inhomogeneous systems such as disordered ones. Disordered sys- 


tems will be considered in section 6.5. 


Van Hove’s result is echoed by studies on discretely distributed 1D sys- 
tems with short range attractive interaction, including the Ising model 
in zero external field, which is a magnetic system where an attractive 
interaction means one favoring aligned ‘spins’. An argument due to Lan- 
dau, when interpreted in the context of the 1D Ising model, points out 
that a transition from a disordered to an ordered phase is not possible 
at any finite temperature due to the competition between energetic and 
entropic factors in which the latter always wins out. More specifically, 
in any imagined transition from a disordered to an ordered phase, the 
latter immediately breaks up into a configuration involving alternating 
segments of oppositely aligned spins since the energy cost of creating two 
such segments with a boundary (a ‘domain wall’) is small compared with 


the entropic gain. 
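Landau's estimate can be put in one line. The following is a sketch, assuming an open chain of $N$ spins with nearest-neighbour energy $-J s_i s_{i+1}$ ($J > 0$), where $J$ and $N$ are symbols introduced here for illustration. A single domain wall raises the energy by $2J$ and can be placed on any one of the $N - 1$ bonds, so that the change in free energy on creating it is

```latex
\Delta F = \Delta E - T\,\Delta S = 2J - k_B T\ln(N - 1),
\qquad
\Delta F < 0 \quad \text{whenever} \quad N - 1 > e^{2J/(k_B T)} .
```

For any $T > 0$, a sufficiently long chain therefore lowers its free energy by admitting a wall, and the ordered configuration cannot survive in the thermodynamic limit.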


As regards the finite cut-off of the range of the attractive interaction, rig- 
orous results on 1D systems indicate that phase transitions need not be 
ruled out for pair interactions falling off more rapidly than the inverse 
squared separation between the interacting units (such a fall-off is sufficient for the existence of the thermodynamic limit). Further, interactions with an external field (or, more generally, the presence of terms in the Hamiltonian that resemble the effect of an external field) may favor phase transition in certain 1D systems, as revealed in studies on model systems that may have relevance in numerous problems of physical interest (e.g., problems relating to macromolecular processes and those involving surface depositions and surface wetting; refer to [24]).


From the mathematical point of view, the partition function of a lattice 
system can be related to the largest eigenvalue of the Nth power of a 
transfer matrix ($N$ = number of interacting particles or units in the system, which goes to $\infty$ in the thermodynamic limit), as we will see in chapter 6, sections 6.2.2.1 and 6.2.4.5. If the largest eigenvalue turns out to
be non-degenerate then a phase transition is ruled out, while a cross- 
ing of the largest eigenvalue with the next largest one implies a phase 
transition. In the case of a continuously distributed system, one needs 
to consider an integral operator instead of a matrix, whose largest eigen- 
value is then seen to be relevant in determining the possibility of a phase 


transition. 
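The role of the largest eigenvalue can be illustrated with the simplest available case. The sketch below assumes a periodic 1D Ising chain with nearest-neighbour coupling $K = \beta J$ and field $h$ (in units of $k_B T$), for which the transfer matrix is the $2 \times 2$ matrix with elements $e^{K s s' + h(s + s')/2}$; the function names are chosen here for illustration. It compares the eigenvalue formula for the partition function with a direct sum over configurations.

```python
import itertools
import math


def transfer_eigenvalues(coupling, field):
    """Eigenvalues of the 2x2 transfer matrix T[s, s'] = exp(K*s*s' + h*(s + s')/2)."""
    a = math.exp(coupling) * math.cosh(field)
    b = math.sqrt(math.exp(2 * coupling) * math.sinh(field) ** 2 + math.exp(-2 * coupling))
    return a + b, a - b


def partition_function_transfer(n_spins, coupling, field):
    """Z_N = Tr(T**N) = lambda_plus**N + lambda_minus**N for a periodic chain."""
    lam_plus, lam_minus = transfer_eigenvalues(coupling, field)
    return lam_plus ** n_spins + lam_minus ** n_spins


def partition_function_brute_force(n_spins, coupling, field):
    """Direct sum over all 2**N spin configurations of the periodic chain."""
    total = 0.0
    for spins in itertools.product((1, -1), repeat=n_spins):
        bonds = sum(spins[i] * spins[(i + 1) % n_spins] for i in range(n_spins))
        total += math.exp(coupling * bonds + field * sum(spins))
    return total


if __name__ == "__main__":
    print(partition_function_transfer(8, 0.7, 0.3))
    print(partition_function_brute_force(8, 0.7, 0.3))
    for coupling in (0.1, 1.0, 3.0):
        print(coupling, transfer_eigenvalues(coupling, 0.0))
```

The two eigenvalues differ by $2\sqrt{e^{2K}\sinh^2 h + e^{-2K}}$, which is strictly positive for every finite $K$; the largest eigenvalue is thus non-degenerate and the free energy per spin, $-k_B T\ln\lambda_+$ in the limit $N \to \infty$, remains smooth, in line with the absence of a phase transition in this model at any finite temperature.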


Kac ([70]) considered such a continuously distributed 1D system of particles with an exponentially decaying attractive interaction ($V(x) = a e^{-\gamma x}$ for $x > \delta$) in addition to a hard core repulsion (of range $\delta$), and could solve exactly for the largest eigenvalue in the thermodynamic limit, which was seen to be non-degenerate and smooth in $s = \frac{p}{k_B T}$ (usual notation). However, when the strength of the attractive interaction ($a$) was taken to be inversely proportional to the range ($a = a_0\gamma$, $a_0$ = constant) and the limit $\gamma \to 0$ was taken (this is referred to as the van der Waals limit, in which the range of the attractive potential goes to infinity while the strength goes proportionately to zero) after the thermodynamic limit, it was found
([71], see also [46], sec. 5.2.) that a set of eigenvalues, along with the 
largest eigenvalue o(s) collapsed on to a limiting value at a certain criti- 
cal temperature (corresponding to a critical value of s), thereby implying 
a phase transition at this temperature. What is more, the equation of 
state in the van der Waals limit turned out to be precisely the Van der 
Waals equation with Maxwell’s equal area rule (refer to section 4.1.3.3) 
built into it. This is in conformity with the observation made earlier that 
a mean field theory can acquire validity in the limit of an infinite range of 


the attractive tail of the interaction potential. 


The constraint operating against the occurrence of phase transitions in 1D systems ceases to be operative for 2D systems (as also for systems in
higher dimensions). The 2D Ising model is of great relevance in statistical 
mechanics since it was the first system for which a phase transition could 
be demonstrated on a rigorous basis. We shall have a look at this model 
in section 6.2.4. Dobrushin ([28]; see also references therein for similar 
work on other, related systems) established the occurrence of a phase 
transition in a lattice gas model with an appropriate attractive interaction 
between sites, where the theory was shown to work for lattice dimension 
two or more. In the case of a 2D lattice, the potential was assumed to fall 
off faster than the inverse fourth power of the separation between lattice 


points. 


The topic of phase transitions will be pursued further in chapter 6 (sections 6.2.4, 6.2.5, 6.3, 6.4), where we will have a brief look at phase transitions of first and second kinds in magnetic lattice systems, including critical phenomena. We also touch upon the important topic of phase transitions in disordered lattice systems in section 6.5 of the same chapter. Phase transitions in bosonic and fermionic systems will be taken up in chapter 7.


For a mathematically rigorous account of the problem of phase transitions, see [121].


5.8 Thermodynamic limit for correlation functions: the Kirkwood-Salsburg equations


Distribution functions for dense systems were introduced in section 4.5.2 
in terms of the canonical ensemble. We will now follow an alternative 
definition of these quantities in terms of the grand canonical ensemble 
and then state the Kirkwood-Salsburg (KS) integral equations satisfied by 
these functions. The latter constitute a useful set of equations to describe 
the behavior of a system in the low density and high temperature regime 


as the thermodynamic limit is approached. 


In section 4.5.2, we were mostly interested in the high density regime, for which the correlation functions, defined in (4-65) in terms of the distribution functions (eq. (4-63a)), were of greater relevance. In the present section we focus on the low density regime, introducing the grand distribution functions (referred to as the ‘correlation functions’ in [121]; such variations in nomenclature from one source to another are common in the literature, and one has to make out the sense from the context unless explicit definitions are stated) in (5-129) below.


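Referring to the grand canonical distribution describing the mixed stationary state of a fluid of activity $z$ ($= e^{\beta\mu}/\lambda^3$, $\lambda$ being the thermal de Broglie wavelength) at temperature $T\,(= (k_B\beta)^{-1})$, and to the potential energy function $\Phi^{[N]}(r^{[N]})$ for N-particle interaction, the grand distribution function $\eta^{[N]}$ is defined as

$$\eta^{[N]}(r^{[N]}) = \frac{1}{Z_g}\sum_{M\geq N}\frac{z^M}{(M-N)!}\int\prod_{i=N+1}^{M} d\mathbf{r}_i\; e^{-\beta\Phi^{[M]}(r^{[M]})},$$ (5-129)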
8 M>NY’ i=N+1 M , 


where the integrations are performed over the volume $V$ of the system. We assume that the interaction between the particles making up the system under consideration is a resultant of two-particle central potentials $\phi$:

$$\Phi^{[N]}(r^{[N]}) = \sum_{1\leq i<j\leq N}\phi(r_{ij}) \qquad (r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|).$$ (5-130)


The relation between the distribution functions based on the canonical ensemble ($\rho^{[n]}(r^{[n]})$ in the notation of sec. 4.5.2), and the grand distribution functions based on the grand canonical ensemble ($\eta^{[N]}(r^{[N]})$ in the notation to be followed in the present section) is made explicit as follows.


The distribution function $\rho^{[n]}(r^{[n]})$ of sec. 4.5.2 represents the probability density that $n$ arbitrarily chosen particles are present around the point $r^{[n]}$ in the configuration space of a system made up of a fixed number ($N$) of particles. However, if the number $N$ itself is variable, as in the case of an open system characterized by chemical potential $\mu$, then one has to average over $N$, for which the probability is $P(N)$ given in (4-87). If, then, one works out the average of the expression in (4-63a) over various possible values of $N$ (with probability distribution $P(N)$; only the values $N \geq n$ need be considered), one obtains the grand distribution function for $n$ particles, regardless of the total number of particles ($N \geq n$). The expression (5-129) gives the grand distribution function for $N$ particles in a fluid where the total number of particles ($M\,(\geq N)$) varies with probability distribution $P(M)$.


The KS equations relate the grand distribution function $\eta^{[N]}(r^{[N]}) = \eta^{[N]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N)$ to similar grand distribution functions involving sets of position co-ordinates where one particular co-ordinate (say, $\mathbf{r}_1$; any other choice would do equally well) is absent, i.e., they effectively describe the way the interaction of the particle at $\mathbf{r}_1$ with all the other particles in the system, which may be imagined to be augmented by the addition of particles other than those involved in the definition of $\eta^{[N]}$, influences the grand distribution function $\eta^{[N]}$. They involve the ‘kernel’ $K_s$ ($s = 1, 2, \cdots$) defined as

$$K_s(\mathbf{r}_1; \mathbf{r}_{N+1}, \cdots, \mathbf{r}_{N+s}) = \prod_{i=N+1}^{N+s}\left(e^{-\beta\phi(r_{1i})} - 1\right).$$ (5-131)


The correlation functions of sec 4.5.2 and the grand correlation functions 
of the present section coincide with one another in the thermodynamic 
limit. The existence of the thermodynamic limit itself is conveniently demon- 
strated in terms of the grand distribution functions, as mentioned at the end 


of the present section. 


Thus, the kernel, which is independent of the volume $V$, involves the interactions of the chosen particle at $\mathbf{r}_1$ with all the particles in the augmented system, excluding the group of particles located at positions $\mathbf{r}_2, \cdots, \mathbf{r}_N$ (i.e., the ones specified in $\eta^{[N]}(r^{[N]})$). We also define

$$\Phi^{[1,N]}(\mathbf{r}_1, \cdots, \mathbf{r}_N) = \sum_{i=2}^{N}\phi(r_{1i}),$$ (5-132)

which involves the interactions of the chosen particle with all the other particles within the group located at $\mathbf{r}_1, \cdots, \mathbf{r}_N$.


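We will now state the KS equations without proof (see [66], chapter 6, for a clear exposition):

$$\eta^{[N]}(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N) = z\,e^{-\beta\Phi^{[1,N]}(\mathbf{r}_1, \cdots, \mathbf{r}_N)}\left[\eta^{[N-1]}(\mathbf{r}_2, \cdots, \mathbf{r}_N) + \sum_{s\geq 1}\frac{1}{s!}\int d\mathbf{r}_{N+1}\cdots d\mathbf{r}_{N+s}\, K_s(\mathbf{r}_1; \mathbf{r}_{N+1}, \cdots, \mathbf{r}_{N+s})\,\eta^{[N-1+s]}(\mathbf{r}_2, \cdots, \mathbf{r}_{N+s})\right] \quad (N = 1, 2, \cdots).$$ (5-133)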


These equations can alternatively be expressed in terms of functions $G^{[N]}(r^{[N]}) = \rho^{-N}\eta^{[N]}(r^{[N]})$, termed the grand correlation functions, analogous to the correlation functions in section 4.5.2 in the context of dense fluids (note the slight change in nomenclature).


Equations (5-133) constitute a linear inhomogeneous system of integral equations for the correlation functions $\eta^{[N]}$. Starting from these equations one can obtain a series expansion for the correlation functions and establish the convergence of the series in the thermodynamic limit for systems characterized by a stable and regular pair potential ([121], chapter 4) and for sufficiently low $\rho$, $\beta$, where $\rho$ stands for the density. This implies, in particular, that there can be no phase transition in this regime.
Stability of an interaction has been defined in sec. 5.2.2.1 (see formula (5-7)).

A pair potential $\phi(r)$ is said to be a regular one if

$$\int d^3r\,\left|e^{-\beta\phi(r)} - 1\right| < \infty,$$ (5-134)

for $\beta > 0$.
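The regularity condition can be checked explicitly for a simple potential. The sketch below assumes a three dimensional square well (a hard core of radius `core`, a well of depth `depth` extending up to `well_range`, and zero interaction beyond), with parameter values and function names chosen here purely for illustration. It compares a numerical estimate of the integral of $|e^{-\beta\phi} - 1|$ with the closed form that follows from the piecewise constant integrand.

```python
import math


def square_well(r, core=1.0, well_range=1.5, depth=1.0):
    """Hard core for r < core, attraction -depth for core <= r < well_range, zero beyond."""
    if r < core:
        return math.inf
    if r < well_range:
        return -depth
    return 0.0


def regularity_integral(potential, beta, r_max=5.0, steps=50000):
    """Midpoint-rule estimate of the 3D integral of |exp(-beta*phi(r)) - 1| over r < r_max."""
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        mayer = math.exp(-beta * potential(r)) - 1.0  # equals -1 inside the hard core
        total += 4.0 * math.pi * r * r * abs(mayer) * dr
    return total


def regularity_exact(beta, core=1.0, well_range=1.5, depth=1.0):
    """Closed form: core volume plus the well's shell volume weighted by exp(beta*depth) - 1."""
    shell = well_range ** 3 - core ** 3
    return 4.0 * math.pi / 3.0 * (core ** 3 + (math.exp(beta * depth) - 1.0) * shell)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        print(beta, regularity_integral(square_well, beta), regularity_exact(beta))
```

The integral is finite for every $\beta > 0$ because the integrand vanishes beyond the range of the well, so a potential of this type is regular; an attractive tail decaying too slowly at large separation would make the integral diverge.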


The grand correlation functions, as obtained from the KS equations, provide us with a complete description of the behavior of a system at sufficiently low $\rho$, $\beta$, and are at times a more convenient route to such a description as compared with the virial expansion introduced in section 4.1.


Analogous to the situation relating to a random variable, all of whose moments determine its probability distribution, grand correlation functions for all possible values of $N$ determine the grand partition function $Z_g$ of a classical fluid and are sufficient to determine all its thermodynamic parameters.
Chapter 6

Statistical Mechanics of Interacting Systems II


6.1 Interacting Systems II: Introduction 


In this chapter we continue to outline the statistical mechanics of a few 
interacting systems, focusing on spatially discrete ones based on underly- 
ing lattice structures. In contrast to fluids discussed in chapter 4, lattice- 
based models admit of a number of simplifying features, and offer a few 
exact results that serve as benchmarks for more realistic models where 


only approximate results, based on numerical schemes, are available. 


Of particular relevance are lattice-based spin systems that serve as idealized models of real-life magnetic substances and that can, at the same time, be mapped to models of other types of systems such as binary alloys and lattice gases. These are of paramount significance in statistical mechanics in that they provide much insight into the statistical mechanics of infinitely extended systems, including a great deal of understanding of the phenomenon of phase transitions. In this sense, the present chapter complements the material presented in chapter 5 by way of providing concrete results illustrating general principles.


We consider lattice-based models of two markedly distinct types — those 
that do not involve built-in disorder, and ones that do. The latter pro- 
vide us with some understanding of disordered systems — ones of vast 
relevance in real life phenomena. While there exist disordered systems of 
various descriptions, we focus on one particular class, namely, the spin 
glasses, where the disorder is of the quenched type. The understanding 
of the behavior of disordered systems, including that of the spin glasses 
(see sec 6.5), is much less advanced than that of spin systems without 


built-in disorder (to be referred to as regular spin systems). 


Among the regular spin systems and spin glasses, we concentrate on Ising 
type systems, where an Ising type system involves classical spins, located 
at points making up a discrete lattice, interacting with one another in 
some specified manner, the interaction being mostly of the short range 
type. Of paramount relevance in this context is the 2-dim Ising model 
on a square lattice (the term ‘Ising model’ is usually meant to imply a 
nearest-neighbour interaction among the spins, while Ising type models 
with long-range interactions are also of relevance, especially in connec- 
tion with spin glasses), where exact results are available for the infinitely 


extended lattice. 


A classical particle in, say, one dimension, is characterized by two continuously varying state variables (the position co-ordinate $x$ and the momentum $p$) while, in contrast, a classical spin is a fictitious object whose state is described by a discrete variable $\sigma$ that can take up, say, $s$ number of values, the value of $s$ being commonly taken to be 2, corresponding to which the variable $\sigma$ is assumed to take up values $\pm 1$ (spin ‘up’ and spin ‘down’ states). A system made up of a number (say, $N$) of classical spins has its states labeled with $N$ discrete commuting variables $\sigma_1, \sigma_2, \cdots, \sigma_N$, each having possible values $\pm 1$, and the state variables (or ‘observables’) of the system are functions of these, a typical state variable having the form of a sum of terms, where each term is a product of the $\sigma_i$'s. Infinitely extended systems of classical spins provide us with deep insights into the statistical mechanics of phase transitions.


In the present chapter we will outline a number of results on the 1-dim and the 2-dim Ising model, where these will be placed in the context of results that hold for $D$-dim lattices ($D \geq 2$) and, in the process, briefly sketch a few ideas relating to infinitely extended systems, the latter being continuations of statements made in sections 5.6 and 5.7 of chapter 5. We will then go on to introduce idealized spin glass systems and briefly state a few relevant results pertaining to these.


The next chapter will be devoted to quantum mechanical interacting sys- 
tems made up of identical particles, these being of particular relevance at 
low temperatures, where remarkable phenomena like superfluidity and 


superconductivity are found to make their appearance.


6.2 The Ising model


6.2.1 The Ising model: Introduction 


The Ising model (in dimension $D \geq 2$) constitutes the simplest non-trivial example of a system exhibiting a phase transition. The dimension $D$ of the underlying lattice is relevant because it determines the multiplicity of pathways between any two given spins in the system, and these pathways determine how the state of one of the spins can affect that of the other. As the dimension is made to increase, the connectivity between spins also goes up rapidly, leaving an imprint on the system behavior as understood in the context of statistical mechanics. The 1-dim Ising model has an exact but trivial solution that tells us that it does not undergo a phase transition (refer to section 5.7.3). This solution will be given below so as to indicate the basic idea underlying the method of transfer matrices. This method, when invoked in the context of the 2-dim Ising model, leads to a highly non-trivial exercise that indicates the presence of a phase transition. It is this crucial result that initiated a phase of stupendous activity in statistical mechanics that continues to date.


We begin our study of the Ising model by first working out the exact solution for the 1-dim case in the presence of a magnetic field. We follow this up by outlining the mean field theory for the Ising model, which gives important insights into the Ising problem for dimension $D \geq 2$ (the mean field theory is a general strategy of great relevance in diverse problems in statistical mechanics; refer back to section 4.1.3.3). We will then state a number of exact results relating to the 2-dim problem, mostly without proof, stressing instead the basic ideas involved. Though the 2-dim problem in the absence of a magnetic field yields an exact solution, many of the results are special cases of results for $D$-dim models with $D \geq 2$, and we will state higher dimensional generalizations of some of the results even though exact solutions are not available. We will conclude by outlining a number of basic ideas in the theory of infinitely extended systems and that of phase transitions in such systems (refer back to sections 5.6 and 5.7), confining ourselves to the context of the Ising model.


The dimension of space in which a model of an interacting system is defined 
and analyzed acquires relevance in determining such features as the possi- 
bility and nature of phase transitions in it, and the universality class of the 
model as revealed in the critical constants characterizing a phase transition. 
From the physical point of view, the connectivity among the constituents of 
the system described by a model determines the relative importance of fluc- 
tuations in the system as compared with the co-operative ordering effects 


that the interactions along the various possible pathways may have. 


The Ising Hamiltonian in a magnetic field $h$ is

$$H = -J \sum_{\langle pq \rangle} \sigma_p \sigma_q - h \sum_p \sigma_p + H_B,$$ (6-1)

where the indices $p$ and $q$ indicate individual lattice sites and $\langle pq \rangle$ indicates a constraint that the summation is to be carried out first by fixing $p$ and summing over the index $q$ corresponding to all the nearest neighbours of the site $p$, and then summing over $p$ running through all the lattice sites, taking care that the interaction between each pair of sites is to be counted only once. Depending on the structure of the lattice, each of the indices $p, q$ may correspond to one or several integer values. In the above expression, $\sigma_p$, for $p$ corresponding to any lattice site, stands for the spin variable located at that site, having possible values $s_p = \pm 1$. Further, $J$ stands for the strength of interaction among the nearest neighbour spins, where we will assume $J > 0$ (‘ferromagnetic’ interaction; $J < 0$ corresponds to an ‘antiferromagnetic’ interaction), and $h$ for the strength of an ‘external magnetic field’ giving rise to an additional energy term for each individual spin. Finally, $H_B$ stands for terms arising from the boundary condition, where some of the commonly occurring boundary conditions will be indicated as we go along.
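The rule that each nearest-neighbour pair in (6-1) is counted only once can be made concrete with a small sketch. The Python function below is illustrative only: the name `ising_energy_1d` and its arguments are assumptions introduced here, and it covers just the simplest case of a 1-dim chain, where the boundary term is represented by an optional bond between the last and the first spin.

```python
def ising_energy_1d(spins, J=1.0, h=0.0, periodic=True):
    """Energy of a 1-dim Ising chain with every nearest-neighbour bond counted once.

    spins    : sequence of +1/-1 spin values s_1, ..., s_N (use N > 2 if periodic)
    periodic : if True, include the boundary bond between s_N and s_1
    """
    n = len(spins)
    n_bonds = n if periodic else n - 1
    # Bond p joins site p and site p+1; the modulo closes the chain into a ring.
    interaction = sum(spins[p] * spins[(p + 1) % n] for p in range(n_bonds))
    return -J * interaction - h * sum(spins)
```

For four aligned spins with $J = 1$ and $h = 0$, the ring has four bonds and the open chain three, so the two energies differ by exactly one bond energy.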


Boundary conditions will be of particular relevance in our considerations below. As we will see, if we start from a finite system with given boundary conditions and go over to the thermodynamic limit (maintaining, in some appropriate sense, the boundary conditions) at any specified temperature, then we may possibly end up with multiple states of the system, depending on the boundary conditions chosen. Thermodynamic quantities like the specific free energy and the specific Gibbs potential, however, are independent of the boundary conditions, as implied by the statement that the thermodynamic behaviour of the system is independent of the boundary conditions (equivalent to the statement relating to the existence of the thermodynamic limit). The existence of multiple states at any given temperature is indicative of a phase transition.


In attempting to solve or analyze any specific lattice problem, one has to supplement the formula (6-1) with the following information: (a) the dimension and the structure of the lattice; in the following, we will mostly consider 1-dim and 2-dim lattices where, in the latter case, we confine ourselves to a simple square lattice in a plane; (b) the specification as to whether the lattice is a finite or an infinitely extended one where, in the former case, we have to further specify the number ($N$) of spins in the lattice and the boundary conditions that determine the term $H_B$; in certain cases it may not be necessary to specify explicitly the boundary term of the Hamiltonian, it being effectively indicated by the choice of the range of the summation in $\sum_{\langle pq \rangle}$.


6.2.2 The 1-dim Ising model 


6.2.2.1 1-dim Ising model: the transfer matrix and the free energy 


A. Periodic boundary condition 

In a finite 1-dim Ising model containing $N$ spins ($N > 1$), the indices $p, q$ of (6-1) range from 1 through $N$. We consider first the model with periodic boundary conditions where the spin $\sigma_N$ is assumed to have, as its nearest neighbours, the spins $\sigma_{N-1}$ and $\sigma_1$, which is equivalent to assuming that the spins are placed on a circular lattice, with $\sigma_N$ flanked by $\sigma_{N-1}$ on one side and $\sigma_1$ on the other. The sum $\sum_{\langle pq \rangle} \sigma_p \sigma_q$, when written out in full, then reads

$$\sum_{\langle pq \rangle} \sigma_p \sigma_q = \frac{1}{2} \big( \sigma_1(\sigma_N + \sigma_2) + \sigma_2(\sigma_1 + \sigma_3) + \cdots + \sigma_N(\sigma_{N-1} + \sigma_1) \big).$$ (6-2)

In other words, we set $H_B = -J\sigma_N\sigma_1$ in (6-1), with the indices $p, q$ ranging from 1 through $N$ in the linear 1-dim lattice we started with.


The partition function then appears in the form of a product

$$\begin{aligned} Z &= \sum_{\{s_1, s_2, \cdots, s_N\}} \exp\big[\beta\big(J s_1 s_2 + \tfrac{1}{2} h (s_1 + s_2)\big)\big] \cdots \exp\big[\beta\big(J s_N s_1 + \tfrac{1}{2} h (s_N + s_1)\big)\big] \\ &= \sum_{\{s_1, s_2, \cdots, s_N\}} \prod_{p=1}^{N} e^{\beta (J s_p s_{p+1} + \frac{1}{2} h (s_p + s_{p+1}))} \end{aligned}$$ (6-3)

(check this out) where, in the last line, we define $s_{N+1} = s_1$ (periodic boundary condition), and the summation is over all possible spin configurations, $s_p$ being the spin value ($+1$ or $-1$) corresponding to the state (‘up’ or ‘down’) of the spin placed at site $p$ of the lattice ($p = 1, 2, \cdots, N$), i.e., the value of the spin variable $\sigma_p$. Evidently, the total number of possible spin configurations is $2^N$.


We now define the $2 \times 2$ transfer matrix $P$ with elements $P_{ij}$ ($1 \leq i, j \leq 2$) as

$$P_{i_p i_q} = e^{\beta (J s_p s_q + \frac{h}{2}(s_p + s_q))} \quad (1 \leq p, q \leq N),$$ (6-4)

where the index $i_p$ is obtained from the spin value $s_p$ as $i_p = \frac{1}{2}(-s_p + 3)$ (and similarly, $i_q = \frac{1}{2}(-s_q + 3)$), so that $i_p$ (resp. $i_q$) is $1, 2$ when $s_p$ (resp. $s_q$) is $+1, -1$. We can then write the partition function in the form

$$Z = \sum_{\{i_1, i_2, \cdots, i_N\}} P_{i_1 i_2} P_{i_2 i_3} \cdots P_{i_N i_1},$$ (6-5)

where the summation is over all possible sequences $\{i_1, i_2, \cdots, i_N\}$, in which each term can have value either 1 or 2 independently of the others, there being in all $2^N$ such sequences. Formula (6-5) tells us that the partition function $Z$ is nothing but the trace of the $N$th power of the transfer matrix $P$:

$$Z = \mathrm{Tr}(P^N).$$ (6-6)


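According to (6-4), the transfer matrix $P$ reads

$$P = \begin{pmatrix} e^{\beta(J+h)} & e^{-\beta J} \\ e^{-\beta J} & e^{\beta(J-h)} \end{pmatrix}$$ (6-7)

(check this out). Its eigenvalues are

$$\lambda_\pm = e^{\beta J} \cosh(\beta h) \pm \big(e^{2\beta J} \sinh^2(\beta h) + e^{-2\beta J}\big)^{1/2},$$ (6-8)

(check this out too), $\lambda_+$ being the larger of the two eigenvalues.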
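The ‘check this out’ prompts above can also be answered numerically. The following is a minimal sketch, assuming the periodic chain and the eigenvalue formula $\lambda_\pm = e^{\beta J}\cosh(\beta h) \pm (e^{2\beta J}\sinh^2(\beta h) + e^{-2\beta J})^{1/2}$; the function names are illustrative choices made here. It compares $\mathrm{Tr}(P^N) = \lambda_+^N + \lambda_-^N$ with a direct sum over all $2^N$ configurations for a small $N$.

```python
import itertools
import math


def transfer_eigenvalues(beta, J, h):
    """Eigenvalues (lambda_plus, lambda_minus) of the 2 x 2 transfer matrix."""
    a = math.exp(beta * J) * math.cosh(beta * h)
    b = math.sqrt(math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                  + math.exp(-2 * beta * J))
    return a + b, a - b


def partition_function_transfer(n, beta, J, h):
    """Z = Tr(P^N) = lambda_plus^N + lambda_minus^N for a periodic chain."""
    lam_plus, lam_minus = transfer_eigenvalues(beta, J, h)
    return lam_plus ** n + lam_minus ** n


def partition_function_brute_force(n, beta, J, h):
    """Z obtained by summing exp(-beta * H) over all 2^N spin configurations."""
    total = 0.0
    for s in itertools.product((1, -1), repeat=n):
        # Periodic chain (n > 2): the last site is bonded back to the first.
        bonds = sum(s[p] * s[(p + 1) % n] for p in range(n))
        energy = -J * bonds - h * sum(s)
        total += math.exp(-beta * energy)
    return total


if __name__ == "__main__":
    n, beta, J, h = 8, 0.7, 1.0, 0.3
    print(partition_function_transfer(n, beta, J, h))
    print(partition_function_brute_force(n, beta, J, h))
```

The brute-force sum costs $2^N$ terms and is practical only for small $N$, while the transfer matrix expression costs the same for any $N$, which is the point of the method.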


The free energy $F$ then works out to

$$F = -\beta^{-1} \ln(\lambda_+^N + \lambda_-^N) = -\beta^{-1} \ln\Big[\lambda_+^N \Big(1 + \Big(\frac{\lambda_-}{\lambda_+}\Big)^N\Big)\Big].$$ (6-9)

Since $\lambda_+ > \lambda_-$, we find that the limit, as $N \to \infty$, of the free energy per spin ($\frac{F}{N}$) exists and is given by

$$f = \lim_{N \to \infty} \frac{F}{N} = -\beta^{-1} \ln\big[e^{\beta J}\cosh(\beta h) + (e^{2\beta J}\sinh^2(\beta h) + e^{-2\beta J})^{1/2}\big],$$ (6-10a)

which simplifies, for $h = 0$, to

$$f = -\beta^{-1} \ln[e^{\beta J} + e^{-\beta J}].$$ (6-10b)

This has no singularity in $\beta$ for any positive finite temperature, which tells us that the 1-dim Ising model does not show a phase transition, in keeping with what we saw in section 5.7.3.
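The step from (6-9) to (6-10a) can be spelled out as follows. This is a short worked version of the limit, using only the fact that $0 < \lambda_- < \lambda_+$ for $J > 0$ and finite positive $\beta$ (both eigenvalues are positive because the trace and the determinant of $P$ are positive).

```latex
\frac{F}{N}
  = -\beta^{-1}\ln\lambda_+
    \;-\; \frac{\beta^{-1}}{N}\,
      \ln\!\Big[1+\Big(\frac{\lambda_-}{\lambda_+}\Big)^{N}\Big],
\qquad
0 \le \ln\!\Big[1+\Big(\frac{\lambda_-}{\lambda_+}\Big)^{N}\Big] \le \ln 2
\;\;\Longrightarrow\;\;
\lim_{N\to\infty}\frac{F}{N} = -\beta^{-1}\ln\lambda_+ .
```

The correction term is thus bounded by $\beta^{-1}\ln 2 / N$, so the free energy per spin is controlled by the larger eigenvalue alone.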


At any fixed temperature and for a non-zero magnetic field $h$, one observes from the expression (6-10a) that the free energy varies smoothly with the field strength $h$ at all finite positive $\beta$. This feature will be seen to persist for the 2-dim Ising model as well.
B. Free boundary condition 


The existence of the thermodynamic limit requires that the free energy per spin be independent of the boundary conditions. Consider, for instance, the 1-dim Ising model with free boundary conditions, in which case the spin at site $p = 1$ interacts only with that at site $p = 2$ and, similarly, the spin at $p = N$ interacts only with that at $p = N - 1$. The partition function is now found to be

$$Z = \sum_{i_1, i_N = 1}^{2} e^{\frac{1}{2}\beta h (s_1 + s_N)} \, (P^{N-1})_{i_1 i_N},$$ (6-11)

where $P$ is the transfer matrix of (6-7), and $s_1, s_N$ are spin values related to $i_1, i_N$ as explained in the paragraph following (6-4) (check this out). One again finds that $\lim_{N \to \infty} \frac{F}{N}$ exists and this limit is, moreover, found to be the same as in (6-10a), thus corroborating that the thermodynamic functions are independent of the boundary conditions used. As we will see below, this statement applies to the 2-dim Ising model as well, with the qualification that the state of the system in the thermodynamic limit may depend on the boundary conditions.


Let $S$ be the orthogonal $2 \times 2$ diagonalizing matrix of the real symmetric transfer matrix $P$, whose columns are made of the normalized eigenvectors $\begin{pmatrix} u_+ \\ v_+ \end{pmatrix}$ and $\begin{pmatrix} u_- \\ v_- \end{pmatrix}$ corresponding to the eigenvalues $\lambda_+, \lambda_-$ respectively (note that, though we are considering free boundary conditions now, $P$ is nevertheless the transfer matrix defined for a model with periodic boundary conditions). One then finds from (6-11) that $Z = e^{\beta h}(S\Lambda_{N-1}S^T)_{11} + e^{-\beta h}(S\Lambda_{N-1}S^T)_{22} + (S\Lambda_{N-1}S^T)_{12} + (S\Lambda_{N-1}S^T)_{21}$, where $\Lambda_n$ is the diagonal matrix with diagonal elements $\lambda_+^n, \lambda_-^n$, i.e., $S^T P^n S = \Lambda_n$.

One can then work out the thermodynamic limit for the 1-dim Ising model with free boundary conditions by considering $\lim_{N\to\infty} \frac{1}{N}\ln Z$ and by making use of the fact that $\lim_{N\to\infty} \big(\frac{\lambda_-}{\lambda_+}\big)^N = 0$, and obtain the expression for the specific free energy that turns out to be identical to the expression (6-10a), as mentioned above.
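The independence of the specific free energy from the boundary conditions can be checked numerically. The sketch below is written under stated assumptions: the free-chain partition function is taken as $Z = \sum_{i_1, i_N} e^{\frac{1}{2}\beta h(s_1 + s_N)} (P^{N-1})_{i_1 i_N}$, and the helper names are hypothetical. It first validates that expression against a direct sum over configurations, and then compares $F/N$ for a long free chain with the periodic-chain limit $-\beta^{-1}\ln\lambda_+$.

```python
import itertools
import math


def transfer_matrix(beta, J, h):
    """2 x 2 transfer matrix; index 0 <-> spin +1, index 1 <-> spin -1."""
    spins = (1, -1)
    return [[math.exp(beta * (J * s * t + 0.5 * h * (s + t))) for t in spins]
            for s in spins]


def mat_mul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(a, n):
    """n-th power (n >= 0) of a 2 x 2 matrix by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


def z_free_transfer(n, beta, J, h):
    """Free-boundary Z; the end spins carry the half field terms that
    the product of N-1 transfer matrices leaves out."""
    spins = (1, -1)
    p_pow = mat_pow(transfer_matrix(beta, J, h), n - 1)
    return sum(math.exp(0.5 * beta * h * (spins[i] + spins[j])) * p_pow[i][j]
               for i in range(2) for j in range(2))


def z_free_brute_force(n, beta, J, h):
    """Free-boundary Z by direct summation over all 2^N configurations."""
    total = 0.0
    for s in itertools.product((1, -1), repeat=n):
        bonds = sum(s[p] * s[p + 1] for p in range(n - 1))  # no bond N -> 1
        total += math.exp(beta * (J * bonds + h * sum(s)))
    return total


def free_energy_per_spin_limit(beta, J, h):
    """Thermodynamic-limit value f = -ln(lambda_plus) / beta."""
    lam_plus = (math.exp(beta * J) * math.cosh(beta * h)
                + math.sqrt(math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                            + math.exp(-2 * beta * J)))
    return -math.log(lam_plus) / beta
```

For a chain of a few hundred spins, `-math.log(z_free_transfer(n, beta, J, h)) / (beta * n)` differs from `free_energy_per_spin_limit(beta, J, h)` only by a boundary correction of order $1/N$.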


6.2.2.2 1-dim Ising model: the mean magnetic moment and the spin-spin correlation


A. The mean magnetic moment. 


Considering a 1-dim Ising model with $N$ spins and with periodic boundary conditions, the mean value of the spin variable $\sigma_q$ at any site $q$ ($= 1, 2, \cdots, N$) is given by (refer to the way the formula (6-3) was arrived at)

$$\langle \sigma_q \rangle = \frac{1}{Z} \sum_{\{s_1, s_2, \cdots, s_N\}} s_q \prod_{p=1}^{N} \exp\big[\beta\big(J s_p s_{p+1} + \tfrac{1}{2} h (s_p + s_{p+1})\big)\big].$$ (6-12)

Because of the symmetry introduced by the periodic boundary condition, this has the same value for all $\sigma_q$ ($q = 1, 2, \cdots, N$) (reason this out; notice the distinction between the spins $\sigma_q$ and the spin values $s_q$), in view of which we write the above mean value as $\langle\sigma\rangle$ and, for the purpose of its evaluation, take $q = 1$ in the right hand side of (6-12).

In the case of the free boundary conditions, $\langle\sigma_q\rangle$ depends on the site index $q$ for any finite $N$, but this dependence disappears in the thermodynamic limit.


Introducing the transfer matrix $P$ of (6-7), one obtains

$$\langle\sigma\rangle = \frac{1}{Z} \sum_{i_1=1}^{2} s_1 \, (P^N)_{i_1 i_1}$$ (6-13)

(check this out), where $s_1$ ($= -2i_1 + 3$) is the spin value at site $q = 1$, corresponding to row (or column) index $i_1$ ($= 1, 2$) of the matrix $P$.


One can evaluate the above expression for $\langle\sigma\rangle$ (which we call the mean magnetic moment, though in an actual magnetic model this is to be multiplied with the elementary magnetic moment (say, $\mu$) of a spin so as to give the mean magnetic moment; the two possible spin states correspond to magnetic moments $\pm\mu$) by referring to the normalized eigenvectors (written as columns, $\begin{pmatrix} u_+ \\ v_+ \end{pmatrix}$ and $\begin{pmatrix} u_- \\ v_- \end{pmatrix}$) of the matrix $P$, and to the diagonalizing matrix $S$, introduced above in connection with (6-11). On going over to the thermodynamic limit and making use of $\lim_{N\to\infty} \big(\frac{\lambda_-}{\lambda_+}\big)^N = 0$, one obtains

$$\langle\sigma\rangle = S_{11}^2 - S_{21}^2 = u_+^2 - v_+^2 = \frac{e^{\beta J}\sinh(\beta h)}{\big(e^{-2\beta J} + e^{2\beta J}\sinh^2(\beta h)\big)^{1/2}}$$ (6-14)

(check this out). One observes that, at any finite temperature, $\langle\sigma\rangle$ is an odd function of the field strength $h$, having the value 0 at $h = 0$ (no spontaneous magnetization, implying the absence of long range correlation between spins). In the high temperature limit ($T \to \infty$), the mean magnetic moment goes to zero ($\langle\sigma\rangle \to 0$) due to a complete lack of spin-spin correlation while, in the limit of zero temperature ($T \to 0$), the mean magnetic moment attains the value $+1$ or $-1$ depending on whether the field $h$ is positive or negative. Finally, for $J = 0$ (non-interacting spins), one obtains $\langle\sigma\rangle = \tanh(\beta h)$ (paramagnetic limit).
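The closed form for the mean magnetic moment can be cross-checked against the free energy per spin. The sketch below relies on the standard relation $\langle\sigma\rangle = -\partial f / \partial h$, which is not derived in the text above and is used here as an assumption; the function names are illustrative. It evaluates $\langle\sigma\rangle = e^{\beta J}\sinh(\beta h) / (e^{-2\beta J} + e^{2\beta J}\sinh^2(\beta h))^{1/2}$ and compares it with a central-difference derivative of $f = -\beta^{-1}\ln\lambda_+$.

```python
import math


def free_energy_per_spin(beta, J, h):
    """f = -ln(lambda_plus) / beta for the 1-dim Ising chain (N -> infinity)."""
    lam_plus = (math.exp(beta * J) * math.cosh(beta * h)
                + math.sqrt(math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                            + math.exp(-2 * beta * J)))
    return -math.log(lam_plus) / beta


def mean_spin(beta, J, h):
    """Closed-form mean magnetic moment per spin in the thermodynamic limit."""
    num = math.exp(beta * J) * math.sinh(beta * h)
    den = math.sqrt(math.exp(-2 * beta * J)
                    + math.exp(2 * beta * J) * math.sinh(beta * h) ** 2)
    return num / den


def mean_spin_from_free_energy(beta, J, h, eps=1e-6):
    """Numerical -df/dh by a central difference, for comparison."""
    return -(free_energy_per_spin(beta, J, h + eps)
             - free_energy_per_spin(beta, J, h - eps)) / (2 * eps)
```

The same function reproduces the limiting cases quoted above: it is odd in $h$, vanishes at $h = 0$, and reduces to $\tanh(\beta h)$ for $J = 0$.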
B. The spin-spin correlation 


Referring once again to the finite 1-dim Ising model with periodic boundary conditions, one can calculate, in a similar manner, the correlation between any two spins, say, $\sigma_p$ and $\sigma_q$ with a separation $r$ between them ($r = |p - q|$), where $r$ can take up values $1, 2, \cdots, N-1$. In virtue of the symmetry resulting from the periodic boundary conditions, this correlation depends on $p, q$ only through the separation $r$ (in the case of the free boundary conditions, the correlation depends separately on $p$ and $q$ for any finite value of $N$, but depends on $r$ alone in the thermodynamic limit, in which case we can set, for the sake of concreteness, $p = 1$, $q = 1 + r$) and is given by

$$\Gamma^{(r)} = \lim_{N\to\infty} \big(\langle\sigma_p\sigma_{p+r}\rangle - \langle\sigma_p\rangle\langle\sigma_{p+r}\rangle\big) = \frac{e^{-2\beta J}}{e^{2\beta J}\sinh^2(\beta h) + e^{-2\beta J}} \Big(\frac{\lambda_-}{\lambda_+}\Big)^r.$$ (6-15)


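One observes that, at any given values of $\beta, h$, the correlation $\Gamma^{(r)}$ decays at large separations $r$ as

$$\Gamma^{(r)} \sim e^{-r/\xi},$$ (6-16a)

where the inverse correlation length $\xi^{-1}$, defined as

$$\xi^{-1} = \lim_{r\to\infty} \Big(-\frac{1}{r} \ln|\Gamma^{(r)}|\Big),$$ (6-16b)

is found to be

$$\xi^{-1} = \ln\frac{\lambda_+}{\lambda_-}.$$ (6-16c)

Thus, $\xi \to 0$ as $T \to \infty$ and, at $h = 0$,

$$\xi = \frac{1}{\ln\coth(\beta J)},$$ (6-17)

which goes to infinity as $T \to 0$.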
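The decay of the spin-spin correlation can be verified with the transfer matrix itself. The following sketch assumes the standard trace expressions for a periodic chain, $\langle\sigma_p\sigma_{p+r}\rangle = \mathrm{Tr}(D P^r D P^{N-r}) / \mathrm{Tr}(P^N)$ and $\langle\sigma_p\rangle = \mathrm{Tr}(D P^N) / \mathrm{Tr}(P^N)$ with $D = \mathrm{diag}(1, -1)$; these expressions and all helper names are introduced here for illustration. The result is compared with the limiting form $\Gamma^{(r)} = \frac{e^{-2\beta J}}{e^{2\beta J}\sinh^2(\beta h) + e^{-2\beta J}} (\lambda_-/\lambda_+)^r$.

```python
import math


def transfer_matrix(beta, J, h):
    """2 x 2 transfer matrix; index 0 <-> spin +1, index 1 <-> spin -1."""
    spins = (1, -1)
    return [[math.exp(beta * (J * s * t + 0.5 * h * (s + t))) for t in spins]
            for s in spins]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(a, n):
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


def trace(a):
    return a[0][0] + a[1][1]


def eigenvalues(beta, J, h):
    a = math.exp(beta * J) * math.cosh(beta * h)
    b = math.sqrt(math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                  + math.exp(-2 * beta * J))
    return a + b, a - b


def connected_correlation_finite(n, r, beta, J, h):
    """<s_p s_{p+r}> - <s_p><s_{p+r}> on a periodic chain of n spins."""
    p = transfer_matrix(beta, J, h)
    d = [[1.0, 0.0], [0.0, -1.0]]          # spin value as a diagonal matrix
    z = trace(mat_pow(p, n))
    pair = trace(mat_mul(mat_mul(d, mat_pow(p, r)),
                         mat_mul(d, mat_pow(p, n - r)))) / z
    single = trace(mat_mul(d, mat_pow(p, n))) / z
    return pair - single ** 2


def connected_correlation_limit(r, beta, J, h):
    """Thermodynamic-limit form: prefactor times (lambda_minus/lambda_plus)^r."""
    lam_plus, lam_minus = eigenvalues(beta, J, h)
    e_minus = math.exp(-2 * beta * J)
    prefactor = e_minus / (math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                           + e_minus)
    return prefactor * (lam_minus / lam_plus) ** r


def correlation_length(beta, J, h):
    """xi = 1 / ln(lambda_plus / lambda_minus)."""
    lam_plus, lam_minus = eigenvalues(beta, J, h)
    return 1.0 / math.log(lam_plus / lam_minus)
```

At $h = 0$ the ratio $\lambda_+/\lambda_-$ equals $\coth(\beta J)$, so `correlation_length(beta, J, 0.0)` grows without bound as $\beta$ increases, in line with the divergence of $\xi$ as $T \to 0$.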
6.2.3 Ising model: the mean field theory
6.2.3.1 The mean field Hamiltonian 


The mean field approach is one of substantial relevance in statistical mechanics in that it provides us with an overview of the thermodynamic behavior of a system described by some model for which no exact solution is available. In the case of the Ising model in dimension $D \geq 2$, the mean field theory predicts the existence of a phase transition at a non-zero finite temperature (the prediction of the theory fails for dimension $D = 1$ where the exact calculation tells us that no phase transition is possible).


Following [41], chapter 2, we refer to the interaction energy, in the Hamiltonian (6-1), of a spin (say, that of $\sigma_i$) with other spins in the lattice as

$$E_{\text{interaction}}(\sigma_i) = -\frac{1}{2}\nu J \sigma_i \Big[\frac{1}{\nu}\sum_j \sigma_j\Big],$$ (6-18)

where the sum over $j$ extends only over the nearest neighbours of the spin $\sigma_i$, with $\nu$ denoting the number of these nearest neighbours (the factor of $\frac{1}{2}$ appears since each pair of spins is to be counted only once in arriving at the interaction energy), and where the expression within the brackets can be interpreted as representing the mean magnetic moment density with which the spin $\sigma_i$ interacts. This mean magnetic moment density is a statistical variable in the model that differs from site to site in any given configuration of the spins. The mean field theory is defined by ignoring this site-to-site fluctuation and making the replacement (we consider, to begin with, a finite lattice with $N$ sites)

$$\frac{1}{\nu}\sum_j \sigma_j \;\to\; \frac{1}{N}\sum_{j=1}^{N}\sigma_j,$$ (6-19)

where the summation on the right hand side extends over all the $N$ sites.


We thereby arrive at the mean field Ising Hamiltonian

$$H = -\frac{\nu J}{2N}\sum_{i,j}\sigma_i\sigma_j - h\sum_i \sigma_i,$$ (6-20a)

where the summation is now for $i, j$ ranging through all the lattice sites. This tells us that the Hamiltonian depends on the spins $\sigma_i$ ($i = 1, 2, \cdots, N$) only through the sum

$$m_\sigma = \frac{1}{N}\sum_{i=1}^{N}\sigma_i,$$ (6-20b)

which we interpret as the magnetic moment density (or the magnetization) due to the spins (however, in the context of magnetism the actual magnetization differs from the spin average by a dimensional multiplicative constant). This greatly simplifies the analysis of the model, and we have

$$H = -N\big(\tfrac{1}{2}\nu J m_\sigma^2 + h m_\sigma\big).$$ (6-20c)


6.2.3.2 Mean field theory: equilibrium states 


The equilibrium probability distribution over the set of all possible spin configurations $\{s\} = \{s_1, s_2, \cdots, s_N\}$ (recall that $s_i$ denotes the value of the random variable $\sigma_i$ ($i = 1, 2, \cdots, N$); a spin configuration, corresponding to spin values $s_1, s_2, \cdots, s_N$, is denoted by $\{s\}$) is then given by

$$P(\{s\}) = \frac{1}{Z}\exp\Big[\beta\Big(\frac{\nu J}{2N}\sum_{i,j}s_i s_j + h\sum_i s_i\Big)\Big] = \frac{1}{Z}\exp\big[\beta N\big(\tfrac{1}{2}\nu J m^2 + hm\big)\big],$$ (6-21a)

where

$$Z = \sum_{\{s\}} \exp\Big[\beta\Big(\frac{\nu J}{2N}\sum_{i,j}s_i s_j + h\sum_i s_i\Big)\Big]$$ (6-21b)

and where $m$ is the value taken up by the random variable $m_\sigma$ in the configuration $\{s\}$. The energy of the configuration, looked at as a function of the parameter $m$, is

$$E(m) = -N\big(\tfrac{1}{2}\nu J m^2 + hm\big),$$ (6-22)

and will be referred to as the energy function for the mean field Ising model. Note that the parameter $m$ differs from the equilibrium value of the mean magnetization (this will be denoted below by the symbol $m^*$), determined by the probability distribution (6-21a). It will be convenient to think of $m$ and $E(m)$ as the mean magnetization and the energy of an arbitrarily specified configuration of the spins.


In a similar vein, the entropy function of the mean field Ising model is defined as

$$S(m) = -Nk_B\Big[\frac{1+m}{2}\ln\frac{1+m}{2} + \frac{1-m}{2}\ln\frac{1-m}{2}\Big].$$ (6-23)

This definition is motivated by the following consideration: imagine the spin system to have a specified magnetic moment $m$, and hence a given energy $E(m)$ as in (6-22). Referring to the microcanonical ensemble corresponding to this energy, the entropy of the spin system is given by $k_B\ln W$, where $W$ is the number of ways that the $N$ spin values $(s_1, s_2, \cdots, s_N)$, each being either $+1$ or $-1$, can be distributed into two groups such that the mean $\frac{1}{N}\sum_i s_i$ equals $m$. This gives

$$S = k_B\ln W = k_B\ln\frac{N!}{\big(\frac{N+mN}{2}\big)!\,\big(\frac{N-mN}{2}\big)!}.$$ (6-24)

For fixed $m$ ($\neq \pm 1$) and for large $N$, one arrives, by the application of Stirling's formula, at (6-23).
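The passage from the counting formula to the entropy function can be checked numerically. The sketch below is illustrative only (the function names are assumptions made here): it evaluates $\frac{1}{N}\ln W$ exactly, with $W = N! / (N_\uparrow!\,(N - N_\uparrow)!)$ computed through `math.lgamma`, and compares it with the Stirling form $-\big[\frac{1+m}{2}\ln\frac{1+m}{2} + \frac{1-m}{2}\ln\frac{1-m}{2}\big]$, both in units of $k_B$ per spin.

```python
import math


def log_multiplicity_per_spin(n, n_up):
    """(1/N) ln W, with W = N! / (n_up! (N - n_up)!) evaluated via lgamma."""
    log_w = (math.lgamma(n + 1) - math.lgamma(n_up + 1)
             - math.lgamma(n - n_up + 1))
    return log_w / n


def entropy_function_per_spin(m):
    """S(m) / (N k_B) from the Stirling form; valid for -1 < m < 1."""
    p_up, p_down = (1 + m) / 2, (1 - m) / 2
    return -(p_up * math.log(p_up) + p_down * math.log(p_down))


if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        n_up = (6 * n) // 10                 # 60% of spins up, i.e. m = 0.2
        m = 2 * n_up / n - 1
        print(n, log_multiplicity_per_spin(n, n_up), entropy_function_per_spin(m))
```

The difference between the two expressions shrinks roughly like $\ln N / N$, which is why it drops out of the specific entropy in the thermodynamic limit.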


We now define the specific free energy function $f(m)$ of the mean field Ising model as

$$f(m) = \frac{1}{N}\big(E(m) - TS(m)\big) = -\big(\tfrac{1}{2}\nu J m^2 + hm\big) + \beta^{-1}\Big[\frac{1+m}{2}\ln\frac{1+m}{2} + \frac{1-m}{2}\ln\frac{1-m}{2}\Big].$$ (6-25)

It turns out that this function can conveniently be made use of (refer to [41], chapter 2) in arriving at a number of relevant results pertaining to the mean field Ising model.


For instance, referring to the random variable $m_\sigma$ corresponding to the magnetic moment per spin of the Ising system (refer to (6-20b)), the probability distribution over the set of possible values of $m_\sigma$, represented by the distribution function to be denoted below by $P(m)$ (where $m$ stands for a possible value of $m_\sigma$, $-1 \leq m \leq +1$), can be worked out from the basic probability distribution formula (6-21a), and is given by

$$P(m) \approx e^{-N\beta(f(m) - f(m^*))},$$ (6-26)

where $m^*$ denotes the mean magnetic moment (see (6-27) below) and where $N$ has been assumed to be large, keeping in mind the thermodynamic limit ($N \to \infty$). The above equation is an asymptotic representation for $P(m)$ where additional terms, negligibly small compared with the right hand side for large $N$, have been dropped, and constitutes an instance of the large deviation principle, where $I(m) = \beta(f(m) - f(m^*))$ stands for the rate function pertaining to the distribution.


The equilibrium state of the Ising system in the mean field theory corresponds to a mean magnetic moment ($m^* = \langle m_\sigma\rangle$) that can be obtained in the thermodynamic limit by minimizing the specific free energy function $f(m)$ (this corresponds to the minimum principle for the free energy of a thermodynamic system specified by a given value of the temperature $T$ in the classical context; refer to section 2.1.5.3 where the principle was explained in the quantum context). On evaluating the derivative $\frac{\partial f}{\partial m}$ and equating it to zero (note that, while the possible values of $m_\sigma$ are discretely distributed for any finite $N$, one obtains a continuous variable $m$ in the thermodynamic limit), one obtains the equation determining the condition of stationarity of the free energy function that has to be necessarily satisfied by $m^*$ (in addition, the stationary value has to correspond to a global minimum of $f(m)$):

$$m^* = \tanh\big(\beta(\nu J m^* + h)\big).$$ (6-27)
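The stationarity condition follows in two lines from the specific free energy function $f(m) = -(\frac{1}{2}\nu J m^2 + hm) + \beta^{-1}\big[\frac{1+m}{2}\ln\frac{1+m}{2} + \frac{1-m}{2}\ln\frac{1-m}{2}\big]$ of (6-25). The following is a worked sketch of that step, relying only on the identity $\tanh^{-1} m = \frac{1}{2}\ln\frac{1+m}{1-m}$.

```latex
\frac{\partial f}{\partial m}
  = -(\nu J m + h) + \frac{1}{2\beta}\,\ln\frac{1+m}{1-m}
  = -(\nu J m + h) + \beta^{-1}\tanh^{-1} m ,
\qquad
\frac{\partial f}{\partial m}\Big|_{m=m^*} = 0
\;\;\Longleftrightarrow\;\;
m^* = \tanh\!\big(\beta(\nu J m^* + h)\big).
```

The second derivative, $\frac{\partial^2 f}{\partial m^2} = -\nu J + \frac{1}{\beta(1-m^2)}$, is negative at $m = 0$ precisely when $\beta\nu J > 1$, which is where the stationary point at the origin turns from a minimum into a maximum.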


We first consider the case $h = 0$, i.e., the zero magnetic field behaviour of the mean field Ising model, when (6-27) reduces to

$$m^* = \tanh(\beta\nu J m^*).$$ (6-28)

The solution to this implicit equation can be obtained graphically, as illustrated in fig. 6-1. One finds that there exists a unique solution for $m^*$, namely, $m^* = 0$, when $\beta < \beta_c$, with $\beta_c$ given by

$$\beta_c = \frac{1}{\nu J}$$ (6-29)

(see fig. 6-1(A)), which corresponds to absence of spontaneous magnetization in the model. On the other hand, for $\beta > \beta_c$, there exist three distinct solutions, of which ($m^* = 0$) actually corresponds to a local maximum of $f(m)$ rather than a minimum, and thus does not correspond to an equilibrium state. The other two solutions (of equal magnitude, see fig. 6-1(B)) then represent two possible equilibrium solutions for the magnetic moment in the absence of an external field $h$ and therefore correspond to spontaneous magnetization in the model. In other words, the mean field Ising model admits of a phase transition at a temperature $T = T_c = k_B^{-1}\beta_c^{-1} = \frac{\nu J}{k_B}$, with two distinct equilibrium states at any temperature $T < T_c$ (referred to as Gibbs states for an infinitely extended system) for which the magnetic moments ($\pm m^*$) are equal in magnitude but of opposite signs.
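The graphical solution of $m^* = \tanh(\beta\nu J m^*)$ has a simple numerical counterpart. The sketch below is one possible approach, not a prescribed method: it uses bisection on $g(m) = m - \tanh(\beta\nu J m)$, and the function name and tolerance are assumptions made here. It returns $m^* = 0$ for $\beta \leq \beta_c = 1/(\nu J)$ and the positive member of the pair $\pm m^*$ otherwise.

```python
import math


def spontaneous_magnetization(beta, nu, J, tol=1e-12):
    """Non-negative equilibrium solution m* of m = tanh(beta * nu * J * m).

    Returns 0.0 for beta <= beta_c = 1 / (nu * J); otherwise finds the
    positive root by bisection (the negative root is its mirror image).
    """
    a = beta * nu * J
    if a <= 1.0:
        return 0.0

    def g(m):
        return m - math.tanh(a * m)

    lo, hi = 1e-9, 1.0            # g(lo) < 0 < g(hi) whenever a > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    nu, J = 4, 1.0                # e.g. a square lattice, where nu = 4
    for beta in (0.20, 0.25, 0.26, 0.30, 0.50):
        print(beta, spontaneous_magnetization(beta, nu, J))
```

Scanning $\beta$ through $\beta_c$ in this way shows $m^*$ rising continuously from zero, which is the bifurcation described above.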
Figure 6-1: Graphical solution to the implicit equation (6-28); graphs are drawn for the functions $y(m) = m$ and $y(m) = \tanh(\beta\nu J m)$; the points of intersection of the two graphs correspond to possible solutions; (A) $\beta < \beta_c = \frac{1}{\nu J}$; the only point of intersection is at the origin, giving $m^* = 0$, which corresponds to a minimum of the specific free energy function; for such a value of $\beta$ there is no spontaneous magnetization; (B) $\beta > \beta_c = \frac{1}{\nu J}$: there occur three points of intersection, of which the one at the origin corresponds to a maximum of the specific free energy function, while the other two correspond to two equilibrium states with equal and opposite values ($\pm m^*$) of the magnetization density; this is indicative of a phase transition at $\beta = \beta_c$.


At any temperature $T > T_c$, the probability distribution for the mean magnetization, given by the function $P(m)$ of (6-26), is, for any large but finite value of $N$, almost entirely concentrated at $m = 0$ ($P(m)$ is exponentially small for $m \neq 0$). For $T < T_c$, on the other hand, $P(m)$ is concentrated close to $m = \pm m^*$, being exponentially small for $m$ away from either of these two values.


Indeed, the two Gibbs states for the infinitely extended system are described 
by two distinct probability distributions over possible spin configurations 
and correspond to pure phases. We will pursue the question of possible 
phase configurations of infinitely extended spin systems in a later section of 


the present chapter (sec. 6.2.5). 


In summary, for the zero field case ($h = 0$), there is a unique equilibrium state with $m^* = 0$ when $\beta < \beta_c$ ($T > T_c$), while there occurs a bifurcation at $\beta_c$, there being two equilibrium states for any $\beta > \beta_c$, with $m^* = \pm m^*(\beta)$, where $m^*(\beta) \to 0$ as $\beta$ approaches $\beta_c$ from above. This is illustrated by the graph marked ‘A’ in fig. 6-2 (the figure additionally includes graphs marked ‘B’ and ‘C’ depicting the variation of the equilibrium magnetization with temperature for each of two non-zero values of $h$, see below). Such a phase transition is referred to as a continuous one, or a phase transition of the second kind.


Figure 6-2: Variation of equilibrium magnetic moment density $m^*$ with temperature $T = (k_B\beta)^{-1}$ at specified values of $h$; the graph marked ‘A’ depicts the variation for $h = 0$, where there exists a single equilibrium value ($m^* = 0$) for all $T > T_c = \frac{\nu J}{k_B}$ and two equal and opposite values ($\pm m^*$) for any $T < T_c$; thus, the variation of $m^*$ with $T$ is non-smooth at $T = T_c$; the graphs marked ‘B’ and ‘C’ are for two equal and opposite values of $h$ where, for each fixed $h$, there corresponds a single branch for the equilibrium value of the magnetic moment density (in contrast to the two branches that appear for the zero field case in the temperature range $0 < T < T_c$), the latter being a smooth function of temperature in the entire range $0 < T < \infty$.

Turning now to the case $h \neq 0$ (mean field Ising model in a magnetic field), the graphical solution (see figures 6-3(A), 6-3(B)) of eq. (6-27) shows further interesting features. For $T > T_c$ there exists a unique solution $m^* = m^*(h; \beta)$ (fig. 6-3(A)) that varies smoothly with $h$, passing through 0 as $h$ is made to pass through zero value. However, the graph of $m^*(h; \beta)$ against $h$ (for a fixed $\beta < \beta_c$) becomes more and more steep near $h = 0$ as $\beta$ approaches $\beta_c$ from below (i.e., for a small variation in $h$ away from 0, $m^*$ varies to a relatively large extent; see fig. 6-4(A), where the variation is shown for two chosen values of $\beta$). For $T < T_c$, on the other hand, one obtains three solutions to (6-27) for any given non-zero value of $h$ of sufficiently small magnitude, of which only one (point P in fig. 6-3(B), which is drawn for a positive value of $h$) corresponds to a global minimum of $f(m)$ considered as a function of $m$, and hence represents the equilibrium solution in the model (among the other two, represented by points Q, R in fig. 6-3(B), one corresponds to a maximum and the other to a local minimum, but not to a global one, of $f(m)$).


When one considers the variation of $m^*(h; \beta)$, the equilibrium magnetization, with the field strength $h$ for a fixed $\beta > \beta_c$, one encounters a discontinuity across $h = 0$ (fig. 6-4(B); recall the occurrence of two equilibrium states for any specified value of $\beta$ larger than $\beta_c$, for the zero field case). This discontinuity is indicative of a phase transition of the first kind at any temperature $T < T_c$. Fig. 6-2 depicts the variation of $m^*(h; \beta)$ with $\beta$ for two fixed non-zero values of $h$ (the two being equal and opposite; in addition, the figure shows the variation of $m^*$ with $\beta$ for $h = 0$, as explained above) where one finds a smooth variation, in contrast to the loss of smoothness at $\beta = \beta_c$ for $h = 0$.
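The selection of the global minimum of $f(m)$, and the resulting jump in $m^*$ across $h = 0$ below $T_c$, can be illustrated numerically. The sketch below is a deliberately crude approach chosen for transparency: it evaluates $f(m) = -(\frac{1}{2}\nu J m^2 + hm) + \beta^{-1}\big[\frac{1+m}{2}\ln\frac{1+m}{2} + \frac{1-m}{2}\ln\frac{1-m}{2}\big]$ on a uniform grid inside $(-1, 1)$ and picks the smallest value. The function names and the grid size are assumptions made here, and the answer is accurate only to the grid spacing.

```python
import math


def specific_free_energy(m, beta, nu, J, h):
    """f(m) = -(nu*J*m^2/2 + h*m) - T*s(m), for -1 < m < 1."""
    p_up, p_down = (1 + m) / 2, (1 - m) / 2
    entropy_term = p_up * math.log(p_up) + p_down * math.log(p_down)
    return -(0.5 * nu * J * m * m + h * m) + entropy_term / beta


def equilibrium_magnetization(beta, nu, J, h, n_grid=20001):
    """Global minimiser of f(m) on a uniform grid strictly inside (-1, 1)."""
    grid = (-1 + 2 * (k + 0.5) / n_grid for k in range(n_grid))
    return min(grid, key=lambda m: specific_free_energy(m, beta, nu, J, h))


if __name__ == "__main__":
    nu, J = 4, 1.0
    for beta in (0.2, 0.5):       # above and below T_c (beta_c = 0.25 here)
        for h in (-0.01, 0.01):
            print(beta, h, equilibrium_magnetization(beta, nu, J, h))
```

For $\beta < \beta_c$ the two small fields give small values of $m^*$ of opposite sign, while for $\beta > \beta_c$ they give values close to $\mp m^*$ of the zero-field case, which is the first-kind discontinuity described above.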


In summary, the mean field Ising model exhibits a rich structure in the totality of equilibrium states when considered in terms of their dependence on the two parameters $h$ and $\beta$: one encounters a critical temperature $T_c$ (corresponding to $\beta = \beta_c = \frac{1}{\nu J}$) at which there occurs a phase transition of the second kind for $h = 0$ while, at any temperature less than $T_c$, there occurs a phase transition of the first kind as $h$ is made to vary across the value $h = 0$ at a fixed temperature.


6.2.3.3 Mean field theory: alternative formulations 


There exist alternative but equivalent formulations of the mean field Ising model.

[Figure 6-3, panels (A) and (B): graphs of $y = m$ and $y = \tanh(\beta(\nu Jm + h))$ plotted against $m$]

Figure 6-3: Graphical solution to the implicit equation (6-27) for specified values of $\beta, h$; graphs are drawn for the functions $y(m) = m$ and $y(m) = \tanh(\beta(\nu Jm + h))$; the points of intersection of the two graphs correspond to possible solutions; (A) for $\beta < \beta_c = \frac{1}{\nu J}$, $h > 0$, there is only a single point of intersection, which corresponds to a minimum of the specific free energy function for the specified $\beta, h$; (B) $\beta > \beta_c$; $h$ is chosen to be positive; there occur three points of intersection, of which the one marked ‘P’ corresponds to a global minimum of the specific free energy function $f(m)$; the one marked ‘Q’ corresponds to a maximum of $f(m)$, while that marked ‘R’ corresponds to a local minimum but not a global one; an analogous situation obtains when $h$ is chosen negative.

[Figure 6-4, panels (A) and (B): graphs of $m^*$ plotted against $h$]

Figure 6-4: Depicting the variation of the equilibrium magnetization $m^*$ with the field strength $h$ for a specified value of $\beta$; (A) $\beta < \beta_c$; the variation of $m^*$ with $h$ is smooth; the slope at the origin rises as $\beta$ is made to approach $\beta_c$ from below; the graphs correspond to two distinct values ($\beta_1, \beta_2$) of $\beta$, where $\beta_1 < \beta_2 < \beta_c$; (B) $\beta > \beta_c$; here the variation of $m^*$ with $h$ involves a discontinuity at $h = 0$, which is indicative of a phase transition of the first kind.

The Weiss theory of ferromagnetism is based on the assumption of an effective internal field at the site of each spin, produced by all the other spins with which it interacts. This implies that all the spins in the lattice are equivalent, and the internal field is then assumed to be proportional to the mean magnetic moment of each spin (where we disregard the proportionality constant). In other words, with an appropriate choice of units, the Hamiltonian is written as


$$H = -\sum_{i=1}^{N}\left(h + h_{\rm internal}\right)\sigma_i, \tag{6-30a}$$

where

$$h_{\rm internal} = J\nu m, \tag{6-30b}$$

with

$$m = \langle\sigma_i\rangle \quad (i = 1, 2, \cdots, N), \tag{6-30c}$$


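the right hand side of the last equality being actually independent of the index $i$. Thus, the probability distribution of any arbitrarily chosen spin $\sigma_i$ ($i = 1, 2, \cdots, N$) is given by

$$P_i(\sigma_i) = \frac{1}{Z}\, e^{\beta(J\nu m + h)\sigma_i}\prod_{j \neq i}\ \sum_{\sigma_j = \pm 1} e^{\beta(J\nu m + h)\sigma_j} \quad (i = 1, 2, \cdots, N), \tag{6-31a}$$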


where we gloss over the distinction between a random variable and any of its possible values (e.g., the spin variable $\sigma_i$ and any of its possible values ($\pm 1$), $s_i$), and where

$$Z = \sum_{\{\sigma\}}\prod_{j=1}^{N}\exp[\beta(J\nu m + h)\sigma_j]. \tag{6-31b}$$

One can then work out the mean $\langle\sigma_i\rangle$ for any chosen $i$ ($= 1, 2, \cdots, N$) and, for consistency, equate it to $m$ (refer to (6-30c)). One obtains, by referring to the spin $\sigma_1$ for the sake of concreteness,

$$m = \frac{\sum_{\sigma_1 = \pm 1}\sigma_1\exp[\beta(J\nu m + h)\sigma_1]\ \prod_{j=2}^{N}\sum_{\sigma_j = \pm 1}\exp[\beta(J\nu m + h)\sigma_j]}{\prod_{j=1}^{N}\sum_{\sigma_j = \pm 1}\exp[\beta(J\nu m + h)\sigma_j]} = \tanh(\beta(\nu Jm + h)), \tag{6-32}$$

which is precisely the equation (6-27) obtained earlier in a slightly different notation.
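The reduction of the ratio of sums to a hyperbolic tangent can be checked by brute force for a small number of spins. The following is a minimal sketch, assuming the factorized weight $\exp[\beta(J\nu m + h)\sum_j\sigma_j]$ described above; the function name `mean_first_spin` and the choice of four spins are hypothetical and serve only the illustration.

```python
import itertools
import math

def mean_first_spin(beta, J, nu, m, h, n_spins=4):
    # Average of sigma_1 over all 2**n_spins configurations, each weighted by
    # exp(a * sum_j sigma_j) with a = beta * (J * nu * m + h).
    a = beta * (J * nu * m + h)
    numerator = 0.0
    denominator = 0.0
    for config in itertools.product((-1, 1), repeat=n_spins):
        weight = math.exp(a * sum(config))
        numerator += config[0] * weight
        denominator += weight
    return numerator / denominator
```

Since the weight factorizes over the spins, the sums over $\sigma_2, \cdots, \sigma_N$ cancel between numerator and denominator, and the result is independent of `n_spins`.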


More generally, a mean field theory involves a variational approach based on the Bogoliubov inequality (see [15], chapter 20) and leads, in the case of the nearest neighbour Ising model, to the same result as obtained above ([146], chapter 4; [99], chapter 7). Other approaches to the mean field Ising model can be found in [41], chapter 2.


The basic idea underlying the Bogoliubov inequality is as follows: alongside the system under consideration (with Hamiltonian $H$; this is referred to as the ‘perturbed system’), one considers an appropriate reference system with Hamiltonian, say, $H_0$ (the ‘unperturbed system’) for which the partition function $Z_0$ (and hence the free energy $F_0$, along with the mean value of any given function defined over the set of microscopic configurations) is known.

The Bogoliubov inequality then reads

$$F \leq F_0 + \langle H - H_0\rangle_0, \tag{6-33}$$

where $F$ is the free energy of the system, which one is required to work out, and $\langle\cdots\rangle_0$ denotes the average value referred to the equilibrium ensemble of the reference (or ‘unperturbed’) system, worked out with the help of the known partition function $Z_0$.


The mean field free energy is then obtained by identifying an appropriate set of parameters $\lambda$ in the problem (these may be parameters involved in the definition of $H$ or $H_0$, and the set may be made up of just one single parameter), and then minimizing the function

$$\Phi(\lambda) = F_0 + \langle H - H_0\rangle_0, \tag{6-34}$$


with respect to these parameters. This constitutes the variational approach 
to the mean field theory - one of general applicability for diverse systems of 


interest. 


In the case of the Ising model one can take $H_0$ as a spin system without interactions, in a magnetic field of strength $\lambda$:

$$H_0 = -\lambda\sum_{i=1}^{N}\sigma_i. \tag{6-35}$$


Incidentally, the variational approach is of general applicability and is of 


great value for classical and quantum systems alike. 


The expression for the mean field free energy obtained, for instance, by the minimization of the variational free energy (formula (6-34) with $H_0$ given by (6-35)), is then found to be ([99])

$$F^{[\text{mean-field}]} = N\left[-\beta^{-1}\ln\left(2\cosh(\beta h + \beta J\nu m^*)\right) + \frac{J\nu}{2}m^{*2}\right], \tag{6-36}$$


where, for any given value of the field parameter $h$, $m^*$ is given by the stable solution satisfying (6-27). Making use of this value of $m^*$, one obtains $F$ as a function of $T$ and $h$ (in addition to the system size, which is made to go to infinity in the end), the two natural variables for the free energy. One can make a Legendre transformation

$$\tilde{F} = F + Nm^*h, \tag{6-37}$$

so as to obtain the free energy as a function of the temperature and the magnetization. The relation between $m^*$ and $h$ is recovered by taking the derivative of $\tilde{F}$ with respect to $m^*$.
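The variational construction leading from (6-34) and (6-35) to (6-36) can be illustrated numerically. The sketch below rests on assumptions not spelled out in the text: the trial free energy per spin is evaluated for the non-interacting reference system, for which $\langle\sigma_i\sigma_j\rangle_0 = \tanh^2(\beta\lambda)$ for distinct sites and the number of bonds is $N\nu/2$; the minimization is done on a simple grid in $\lambda$; the names `trial_free_energy` and `minimize_on_grid` and the default values $J = 1$, $\nu = 4$ are hypothetical.

```python
import math

def trial_free_energy(lam, beta, h, J=1.0, nu=4):
    # Per-spin value of F_0 + <H - H_0>_0 for the reference system H_0 = -lam * sum_i sigma_i.
    t = math.tanh(beta * lam)                          # <sigma_i>_0
    f0 = -math.log(2.0 * math.cosh(beta * lam)) / beta
    return f0 + (lam - h) * t - 0.5 * J * nu * t * t

def minimize_on_grid(beta, h, J=1.0, nu=4, lam_max=5.0, steps=100_000):
    # Crude grid search over lam in [-lam_max, lam_max]; lam_max should exceed |h| + J*nu.
    candidates = (-lam_max + 2.0 * lam_max * k / steps for k in range(steps + 1))
    return min(candidates, key=lambda lam: trial_free_energy(lam, beta, h, J, nu))
```

At the minimizing value of $\lambda$ one expects $\lambda = h + J\nu m$ with $m = \tanh(\beta\lambda)$, which reproduces the self-consistency condition (6-27), and the minimum value agrees with the per-spin form of (6-36).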


6.2.3.4 Mean field theory: critical exponents 


As mentioned above, a phase transition of the second kind occurs across the state $m^* = 0$ at $\beta = \beta_c$ (eq. (6-29)) for $h = 0$, which constitutes the critical state of the mean field Ising model. As $\beta$ is made to vary across $\beta_c$, $m^*$ varies continuously, but is not an analytic function of $\beta$. This lack of analyticity is shared by various other state functions in the model, and one can define a set of critical exponents describing the nature of variation of these functions as $\beta$ approaches $\beta_c$ either from below or from above.


For instance, the variation of $m^*$ with $T$ for $\beta$ greater than but close to $\beta_c$ is of the form

$$m^*(T) \sim (T_c - T)^{\frac{1}{2}}. \tag{6-38a}$$

This is seen by putting $T = T_c - \epsilon$ ($\epsilon > 0$), i.e., $\beta = \beta_c + k_B\beta_c^2\epsilon$ to first order in $\epsilon$, in (6-28), expanding in $\epsilon$, and retaining only the leading term in the expansion, thereby obtaining (6-38a) (check this out; there appear three solutions for $m^*$, of which one, namely, $m^* = 0$ is discarded on noting that it corresponds to a maximum of the specific free energy function while, among the other two, the positive root is chosen for the sake of concreteness).


This is of the form

$$m^*(T) \sim (T_c - T)^{\beta'} \quad \left(\text{with } \beta' = \tfrac{1}{2}\right), \tag{6-38b}$$

where $\beta'$ is referred to as a critical exponent. The above result is then expressed by saying that the mean field value of the critical exponent $\beta'$ is $\frac{1}{2}$ (the critical exponent for $\beta \to \beta_c^-$ is zero, which is indicative of the lack of analyticity mentioned above; we make use of the symbol $\beta'$ instead of the more commonly used symbol $\beta$ for the critical exponent since the latter is already in use with a different connotation).
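The value $\beta' = \frac{1}{2}$ can be checked numerically from the zero field self-consistency condition. The sketch below assumes units with $k_B = 1$ and $\nu J = T_c$, and parametrizes the temperature by the relative deviation $T = T_c(1 - \epsilon)$ (a dimensionless variant of the substitution used above), so that the condition reads $m = \tanh(m/(1 - \epsilon))$; the function names are hypothetical.

```python
import math

def spontaneous_m(eps, iters=200):
    # Positive root of m = tanh(m / (1 - eps)) for 0 < eps << 1, found by bisection.
    lo, hi = 1e-6, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.tanh(mid / (1.0 - eps)) < 0.0:
            lo = mid      # still below the non-trivial root
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponent_estimate(eps1=1e-3, eps2=1e-4):
    # Slope of log(m*) against log(eps) between two small values of eps.
    return math.log(spontaneous_m(eps1) / spontaneous_m(eps2)) / math.log(eps1 / eps2)
```

The leading term of the expansion gives $m^{*2} \approx 3\epsilon$, so that the estimated slope is expected to lie close to $\frac{1}{2}$, with small corrections of order $\epsilon$.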


Other critical exponents in the mean field approximation can be obtained analogously. For instance, defining the mean specific energy $e(T, h)$ as $e(T, h) = \lim_{N\to\infty}\frac{1}{N}\langle H\rangle$, with the mean field Hamiltonian $H$ given by (6-20a), and the specific heat as

$$c(T) = \frac{\partial e}{\partial T}, \tag{6-39}$$


one can work out the mean field values of the critical exponents $\alpha, \alpha'$ defined by means of the formulas (we use variables $\beta = (k_BT)^{-1}$ and $\beta_c = (k_BT_c)^{-1}$ in the formulas below)

$$c(\beta) \underset{\beta\to\beta_c^-}{\sim} (\beta_c - \beta)^{-\alpha}, \qquad c(\beta) \underset{\beta\to\beta_c^+}{\sim} (\beta - \beta_c)^{-\alpha'}, \tag{6-40a}$$

where it turns out that the said values are

$$\alpha = 0, \quad \alpha' = 0. \tag{6-40b}$$

Though both the exponents $\alpha, \alpha'$ are zero, the specific heat shows a discontinuity at $\beta = \beta_c$, as shown in fig. 6-5.

[Figure 6-5: graph of $c$ plotted against $\beta$, with $\beta_c$ marked on the horizontal axis]

Figure 6-5: Depicting the discontinuous variation, with temperature, of the specific heat per site ($c$) at zero field in the mean field Ising model; in spite of the discontinuity at $\beta = \beta_c$, the critical exponents $\alpha, \alpha'$, as defined by means of (6-40a), are both zero.


Along with the non-analyticity of $m^*(\beta, h)$ (the equilibrium magnetization at any specified $\beta, h$) at $\beta = \beta_c$ for $h = 0$, there occurs a singularity of the susceptibility

$$\chi(\beta) = \left(\frac{\partial m^*}{\partial h}\right)_{h=0} \tag{6-41a}$$

as $\beta$ approaches $\beta_c$ from below (i.e., $T$ approaches $T_c$ from above) because of the spontaneous magnetization appearing at $\beta_c$. The susceptibility diverges at $\beta = \beta_c$ as

$$\chi(\beta) \underset{\beta\to\beta_c}{\sim} (\beta_c - \beta)^{-\gamma}, \tag{6-41b}$$

where the mean field value of the critical exponent $\gamma$ works out to

$$\gamma = 1. \tag{6-41c}$$


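Finally, from the variation of the magnetization with $h$ near $h = 0$ at $\beta = \beta_c$, one can define a critical exponent $\delta$ by means of

$$m^*(\beta = \beta_c, h) \underset{h\to 0}{\sim} h^{\frac{1}{\delta}}, \tag{6-42a}$$

where one obtains, for the mean field Ising model,

$$\delta = 3. \tag{6-42b}$$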
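The mean field values of $\gamma$ and $\delta$ can be estimated numerically from the self-consistency condition (6-27). The following is a rough sketch under assumptions made only for the example: $J = 1$, $\nu = 4$ (so that $\beta_c = \frac{1}{\nu J} = 0.25$), the susceptibility approximated by the ratio $m^*/h$ at a very small field, and exponents read off as slopes between two points on a log-log scale; all function names are hypothetical.

```python
import math

NU, J = 4, 1.0
BETA_C = 1.0 / (NU * J)

def m_eq(beta, h, iters=200):
    # Root of m = tanh(beta*(NU*J*m + h)) in (0, 1) for h > 0 and beta <= BETA_C (bisection).
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.tanh(beta * (NU * J * mid + h)) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_estimate(t1=1e-2, t2=1e-3, h=1e-8):
    # chi ~ (beta_c - beta)^(-gamma), with beta = BETA_C * (1 - t) and chi approximated by m/h.
    def chi(t):
        return m_eq(BETA_C * (1.0 - t), h) / h
    return -math.log(chi(t1) / chi(t2)) / math.log(t1 / t2)

def delta_estimate(h1=1e-6, h2=1e-8):
    # m ~ h^(1/delta) at beta = BETA_C.
    slope = math.log(m_eq(BETA_C, h1) / m_eq(BETA_C, h2)) / math.log(h1 / h2)
    return 1.0 / slope
```

For $\beta < \beta_c$ the linearized condition gives $\chi = \beta/(1 - \beta\nu J)$, while at $\beta = \beta_c$ the cubic term of the expansion of the hyperbolic tangent gives $m^{*3} \approx 3\beta_c h$; the two estimates are thus expected to come out close to 1 and 3 respectively.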


While the values of the critical exponents stated above are all for the mean field Ising model, the definitions of the exponents themselves, as given above, make sense for more general magnetic models admitting of continuous phase transitions. Based on the values of these exponents, various systems can be classified into universality classes where models belonging to any particular universality class are characterized by the same values of the exponents, regardless of the physical systems they represent and of their Hamiltonian functions.


The critical exponents obtained above in the mean field Ising model differ markedly from those obtained in more realistic estimations for several classes of models, especially from exact results obtained in the 2-dim Ising model. Significantly, however, the quality of the mean field results depends on the dimension of the system under consideration. In particular, the mean field critical exponents for the Ising model obtained in the present section become exact for dimension $D > 4$, where the special value $D = 4$ is referred to as the upper critical dimension for the model. In our present discussion, the dimension of the lattice under consideration enters into the theory through $\nu$, the number of nearest neighbours of any specified lattice point (for instance, $\nu = 2D$ for a cubic lattice in $D$ dimensions), and our results imply that the mean field critical exponents are, in fact, independent of $D$ (the theory does not apply for $D = 1$ since the 1-dim Ising model does not admit of a phase transition).


Along with the upper critical dimension, a model of an interacting system is characterized by a lower critical dimension as well, below which it is characterized by the absence of a phase transition. As the dimension is lowered, the connectivity between the constituents of a system decreases, as a result of which the co-operative effects of the interaction cannot get the better of fluctuations. The lower critical dimension of the Ising model is bounded below by the value $D = 1$ - recall from sec. 5.7.3 that phase transitions are ruled out in 1-dim systems with short-range interactions.


Continuous phase transitions are fruitfully studied by making use of 
ideas in scaling and renormalization group theory, closely related to sym- 


metry considerations. These will be taken up briefly in sec. 6.4. 


6.2.3.5 Mean field theory: limitations 


As mentioned above, the mean field theory predicts wrong values for the 


critical exponents for D < 4 and has basic limitations built into it. 


In particular, the mean field theory goes wrong in respect of the spin correlations since it effectively ignores the correlations between any two distinct spins, say, $\sigma_i, \sigma_j$ ($i \neq j$). This can be checked by referring to (6-21a) from which one obtains, in the thermodynamic limit ($N \to \infty$),

$$\langle\sigma_i\sigma_j\rangle = \langle\sigma_i\rangle\langle\sigma_j\rangle \quad (i \neq j). \tag{6-43}$$


Indeed, as mentioned above, the probability distribution over possible values of the mean magnetization, which also gives the probability distribution for $\sigma_i$ regardless of the index $i$ (reason this out; all the spins are completely equivalent in the mean field theory), is concentrated at the equilibrium value $m^*$ (recall that $m^* = 0$ for $\beta < \beta_c$, and can have any of two possible values for $\beta > \beta_c$; the probability distribution reduces to a delta function in the thermodynamic limit), which explains (6-43) (a result that can also be obtained in any of the alternative approaches mentioned in 6.2.3.3). However, the above formula lacks consistency since it fails for $i = j$ because $\langle\sigma_i^2\rangle = 1$, while $\langle\sigma_i\rangle^2 = m^{*2}$, for all $i$.


In other words, the mean field theory does not correctly account for fluc- 
tuations and spin correlations, notwithstanding the fact that it does pre- 
dict a phase transition and that the critical exponents derived in the the- 
ory become exact for D > 4. The upper critical dimension marks the 
borderline across which the scaling properties characterizing a system 
undergo a fundamental change (refer to sec. 6.3.4 for further elabora- 


tions). 


The limitation of the mean field theory also shows up in the Gibbs states 
it predicts below the critical temperature ($T < T_c$; the term ‘Gibbs state’ is commonly used to refer to equilibrium states of infinitely extended systems). As we have seen, there appear two such states that have been
termed pure phases in sec. 6.2.3.2. However, the theory of infinitely ex- 
tended systems tells us that there exist, in general, equilibrium solutions 
corresponding to mixed phases and coexisting phases apart from and in 
addition to the pure phases (coexisting phases appear in the Ising model 
only for D > 2). In other words, the mean field theory yields only a limited 


set of equilibrium solutions in the thermodynamic limit. 


6.2.4 The 2-dim Ising model 


The 2-dim Ising model is very special in that it admits of a phase transition (as does the Ising model in any dimension $D \geq 2$) and, in contrast to models in $D > 2$, possesses an exact solution (for magnetic field $h = 0$). That a phase transition exists in the 2-dim Ising model was established by Peierls by employing a combinatorial argument, and the exact value of the critical temperature $T_c$ was suggested by Kramers and Wannier, their result being subsequently confirmed by the exact results obtained by Onsager.


In this section, we will first enumerate a number of boundary conditions 
relevant to the issue of phase transitions in the Ising model, and will 
then have a brief look at the combinatorial argument of Peierls as also at 
the duality argument of Kramers and Wannier. This will be followed by 
an introduction to the basic idea underlying the transfer matrix method 


leading to the exact solution obtained by Onsager. 


In a subsequent section (sec. 6.2.5) we will present a number of ideas and
statements (again, without proof) relating to Gibbs states for the infinitely 
extended Ising lattice — ones of basic relevance to the issue of phase 


transitions in statistical mechanics. 


In the following, we will consider a 2-dim planar square lattice (see figures 6-6, 6-7 below) with adjacent lattice points separated by a unit distance (for the sake of concreteness), to which most of the results will apply. The spin located at any lattice point will be assumed to interact with its ($\nu =$) 4 nearest neighbours, where the interaction between any pair of nearest neighbours will be said to correspond to a bond. The individual sites will be denoted by indices $i, j$ etc., and nearest neighbour pairs by symbols such as $\langle ij\rangle$, where $i, j$ will vary over a finite or an infinite set depending on the context. Each index like $i$ or $j$ will actually stand for a pair of integers, giving the co-ordinates of a site with reference to some appropriately chosen site as origin, and with co-ordinate axes parallel to the lines joining a site with its nearest neighbours. The two co-ordinates can be looked upon as a row index and a column index with the rows being the horizontal lattice lines and the columns being the vertical ones.


For the sake of generality, one may consider a rectangular array of lattice 
sites with unequal numbers of rows and columns (with the numbers grow- 
ing through a succession of stages toward the thermodynamic limit) and, 
moreover, assume that the coupling constants along the horizontal and ver- 
tical directions differ from each other, still obtaining exact results. However, 
in the following, we will, for the sake of simplicity, assume that the two 


coupling constants are equal with J as their common value. 


6.2.4.1 Boundary conditions in the Ising model 


Though the extensive thermodynamic variables of a system are indepen- 
dent of the boundary conditions used in describing the system in terms 
of a model in statistical mechanics (as we have seen in chapter 5, this 
is what one means by saying that the thermodynamic limit exists), the 
state of the system, as described by the probability distribution over the 
relevant microscopic configurations (such as the possible configurations 
of spins in an Ising lattice), may depend on the boundary conditions, and 


this dependence continues to remain in the thermodynamic limit. 


For instance, considering a 2-dim Ising lattice with $N$ spins, one can evaluate the probability distribution over possible configurations of the spins under a specified set of boundary conditions (we call it $\eta_1$; several instances of boundary conditions will be enumerated below) and work out a number of microscopic features like, say, the mean spin at any given site (say, the one at a pre-assigned origin). The thermodynamic limit $N \to \infty$ can then be taken (maintaining, in some appropriate sense, the boundary conditions $\eta_1$), arriving at the value of the mean spin at the origin in this limit, which we call $\langle\sigma_0\rangle^{(\eta_1)}$. One can equally well carry out the same exercise while making use of a different set of boundary conditions, say, $\eta_2$. The value $\langle\sigma_0\rangle^{(\eta_2)}$ so obtained will be seen, in general, not to agree with the value $\langle\sigma_0\rangle^{(\eta_1)}$ obtained under $\eta_1$. This sensitivity to boundary conditions is a special feature of models admitting of phase transitions where characteristic quantities such as the order parameter usually depend on the boundary conditions.


Fig. 6-6(A), (B) depict two commonly invoked boundary conditions where, in each of these, a finite lattice with 6 rows and 6 columns (i.e., with $N = 36$ spins) is shown, with the boundary spins marked as filled circles in (A), the rest of the spins (the ‘interior’ ones) being marked as open circles (recall that each spin can take up a value of either $+1$ or $-1$).


In fig. 6-6(A), the boundary spins are assumed to interact only with their nearest neighbours in the interior, and not with spins that may be imagined to be located at sites beyond the boundary (such sites are not shown in the figure), thereby defining the free or open boundary condition (referred to by the symbol $\eta_F$). In (B), the spins in the sixth column interact with their nearest neighbours to the left (i.e., with those in the fifth column) and are also assumed to interact with those in the first column, a copy of which may be imagined to be placed to the right of the sixth column; similarly, the spins in the first column interact with those in the second, located to the right, and also with those of the sixth, a copy of which may be imagined to be placed to the left of the first column. Likewise, corresponding elements in the first and the sixth rows are also assumed to interact as nearest neighbours. This defines the periodic boundary condition ($\eta_P$) where one can imagine the lattice to be wrapped around in such a way that the first and the sixth columns become adjacent to each other, as also the first and the sixth rows.

Figure 6-6: Boundary conditions for a 2-dim Ising lattice; a finite lattice made up of 6 rows and 6 columns (number of lattice points $N = 36$) is shown, with the boundary lattice sites marked with filled circles in (A), the rest of the lattice sites being marked with open circles; (A) corresponds to the free or open boundary condition (to be referred to as $\eta_F$); spins at the boundary sites interact only with their nearest neighbours on one side in the ‘interior’ (open circles), there being no interaction with any spin (not shown) that may be located beyond the boundary; (B) periodic boundary condition ($\eta_P$); boundary spins on the sixth column are assumed to interact with those on the first column (in addition to their interaction with spins in the fifth column), while similar interactions are assumed for the other boundary spins; it is useful to imagine the lattice to be wrapped around in such a manner that the first and the sixth columns, as also the first and the sixth rows, become adjacent to each other; the dotted square marks the boundary rows and columns; the interactions between spins on boundary rows as also between those on boundary columns are depicted symbolically with bent arrows; in (A) and (B), the bonds (not shown) between nearest neighbours may be ‘low energy’ or ‘high energy’ ones.
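The difference between the free and the periodic boundary conditions shows up directly in the set of bonds of the finite lattice. The following minimal sketch enumerates the nearest neighbour bonds of a rectangular array under either condition; the function name `bonds` and the representation of a bond as an unordered pair of (row, column) indices are hypothetical choices made for the illustration.

```python
def bonds(rows, cols, periodic):
    # Nearest neighbour bonds of a rows x cols lattice, each stored as an unordered pair of sites.
    out = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):        # neighbour to the right, neighbour below
                r2, c2 = r + dr, c + dc
                if periodic:
                    r2 %= rows                      # wrap the last row onto the first
                    c2 %= cols                      # wrap the last column onto the first
                elif r2 >= rows or c2 >= cols:
                    continue                        # free boundary: no bond beyond the edge
                out.add(frozenset(((r, c), (r2, c2))))
    return out
```

For the $6 \times 6$ array of fig. 6-6, the free boundary condition gives $2 \times 6 \times 5 = 60$ bonds, while the periodic one gives $2N = 72$ bonds, every site then having exactly four nearest neighbours.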
Similarly, fig. 6-7(A), (B) show instances of a finite 2-dim Ising lattice ($6 \times 6$, $N = 36$) with additional rows and columns of spins placed beyond the boundary spins, where these additional spins may be looked upon as ones belonging to an infinitely extending set of lattice points (not shown). In (A), all these additional spins are assumed to be frozen in the ‘+’ state ($\sigma = +1$) while in (B) these are all in the ‘-’ state ($\sigma = -1$). These are said to define the ‘+’ and ‘-’ boundary conditions respectively (to be referred to as $\eta_+$ and $\eta_-$).

[Figure 6-7, panels (A) and (B)]

Figure 6-7: Further instances of boundary conditions for a finite 2-dim Ising lattice; (A) the ‘+’ boundary condition ($\eta_+$); spins at the boundary sites interact with their nearest neighbours in the ‘interior’ (open circles; a $6 \times 6$ array is shown) as also with additional spins imagined to be placed at ‘exterior’ sites (only the nearest neighbours to the boundary sites are shown), all of which are assumed to be frozen in the ‘+’ state; (B) the ‘-’ boundary condition ($\eta_-$); the exterior spins are all assumed to be frozen in the ‘-’ state.


Fig. 6-8 depicts a boundary condition of a different type (an inhomogeneous one) where the upper half of the spins beyond the boundary ones (filled circles) are frozen in the + state while those in the lower half are all in the - state (only the nearest neighbours to the boundary sites are shown). This is referred to as the Dobrushin boundary condition ($\eta_D$), and is useful in the context of the phase coexistence problem. The ‘+’ and the ‘-’ spins beyond the boundary may be interchanged, resulting in a different, though analogous, boundary condition of the Dobrushin type.

Figure 6-8: Depicting the Dobrushin boundary condition for a finite Ising lattice; spins are shown at sites beyond the boundary ones, where the spins in the upper half of the diagram are assumed to be frozen in the ‘+’ state and those in the lower half in the ‘-’ state; the ‘+’ and ‘-’ spins may be interchanged, resulting in a different boundary condition of the Dobrushin type.


The boundary conditions analogous to those indicated above may be de- 


fined for Ising lattices of dimension D > 2 as well. 


Generally speaking, a boundary condition can be defined for any finite set of lattice points (such a finite set with, say, $N$ elements, will be denoted by the symbol $\Lambda$) by imagining an infinitely extended lattice (not shown in figures above) of spins to which $\Lambda$ belongs. The configuration of spins within $\Lambda$ will be treated as variable (there are $2^N$ such configurations) while the spins in the rest of the infinite lattice (exterior to $\Lambda$) will be assumed to be frozen in some given configuration, say, $\eta$, where $\eta$ will define the boundary condition on the spins in $\Lambda$, as the latter is made to expand in size through a succession of stages so as to go over to the thermodynamic limit (the part of the lattice exterior to $\Lambda$ is made to undergo a corresponding series of expansions). Such a representation is useful for the boundary conditions $\eta_+$, $\eta_-$, and $\eta_D$ defined above.
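The role of a frozen exterior configuration can be illustrated by brute force on a very small array. The sketch below computes the mean spin at the centre of the square array with corners at $(-n, -n)$ and $(n, n)$ at zero field, with all exterior spins frozen at a common value `eta`; the function name, the use of a single number for the exterior configuration, and the convention that `eta = 0` stands for the free boundary condition are assumptions made for the example, and the enumeration over $2^N$ configurations is feasible only for very small $n$.

```python
import itertools
import math

def mean_center_spin(n, beta, eta, J=1.0):
    # <sigma_0> on the (2n+1) x (2n+1) array, exterior spins frozen at eta (+1, -1, or 0 for free).
    sites = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]
    index = {site: k for k, site in enumerate(sites)}
    numerator = 0.0
    denominator = 0.0
    for config in itertools.product((-1, 1), repeat=len(sites)):
        energy = 0.0
        for (x, y), k in index.items():
            # Bonds to the right and upward: interior bonds are counted exactly once here.
            for nb in ((x + 1, y), (x, y + 1)):
                s_nb = config[index[nb]] if nb in index else eta
                energy -= J * config[k] * s_nb
            # Bonds to the left and downward matter only if they reach an exterior site.
            for nb in ((x - 1, y), (x, y - 1)):
                if nb not in index:
                    energy -= J * config[k] * eta
        weight = math.exp(-beta * energy)
        numerator += config[index[(0, 0)]] * weight
        denominator += weight
    return numerator / denominator
```

For any finite array the ‘+’ condition gives a positive mean spin at the centre, the ‘-’ condition gives the opposite value, and the free condition gives zero by symmetry; whether the difference survives as $n \to \infty$ is the question addressed in the sections that follow.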


6.2.4.2 High temperature and low temperature representations: closed graphs and contours


The bonds between nearest neighbours can be of two types: the low energy ones (those between spins both of which are either in the ‘+’ or in the ‘-’ state, i.e., the ‘++’ and the ‘--’ bonds), each making a contribution $-J$ to the energy of any specified spin configuration, and the high energy ones (‘+-’ bonds), each making a contribution $+J$ to the energy of a spin configuration.


There exist two complementary approaches for the evaluation of the par- 
tition function in the form of a series expansion, both involving combina- 
torial considerations. Of these, one is suitable for the high temperature 
regime and the other for the low temperature regime, while a relation of 


duality obtains between the two, as explained below. 


A. The high temperature representation: closed graphs

The partition function for a finite 2-dim Ising system of $N$ spins at temperature $T = (k_B\beta)^{-1}$ is given, for the zero field case ($h = 0$), by

$$Z = \sum_{\{s\}}\exp\Big[\beta J\sum_{\langle ij\rangle}s_is_j\Big], \tag{6-44}$$

where $s_i$ ($i = 1, 2, \cdots, N$) is the value ($+1$ or $-1$) of the spin variable $\sigma_i$ in any given spin configuration, and $\{s\}$ represents a typical configuration.


Since $s_i^2 = 1$ for each $i$ regardless of the configuration, one has the identity

$$e^{\beta Js_is_j} = \cosh(\beta J) + \sinh(\beta J)\,s_is_j = \cosh K\,(1 + s_is_j\tanh K) \quad (K = \beta J). \tag{6-45}$$

The partition function can then be expressed in the form

$$Z = (\cosh K)^B\sum_{\{s\}}\prod_{\langle ij\rangle}(1 + s_is_j\tanh K), \tag{6-46}$$

where $B$ denotes the number of bonds in the finite lattice $\Lambda$ under consideration for which the default will be a square array $\Lambda$ with $N$ sites (for the sake of concreteness we consider an array $\Lambda_n$ with opposite corners at $(-n, -n)$ and $(n, n)$ with reference to a co-ordinate system with axes parallel to the lines connecting adjacent lattice sites ($n = 1, 2, \cdots$)). We leave the boundary condition as implied for the time being. The product on the right hand side of (6-46) can be expanded in the form of a sum of terms of the form

$$S_l = (\tanh K)^l(s_{i_1}s_{j_1})(s_{i_2}s_{j_2})\cdots(s_{i_l}s_{j_l}), \tag{6-47}$$

where, for any given $l$ ($= 1, 2, \cdots$), each of the index pairs $\{i_1, j_1\}, \cdots, \{i_l, j_l\}$ corresponds to nearest neighbour sites, each nearest neighbour pair in the product occurring only once, while an index such as $i_k$ or $j_k$ ($k = 1, \cdots, l$) can occur more than once. However, the summation over spin configurations in (6-46) ensures that each index occurs an even number of times since an odd number of occurrences implies a zero contribution in the sum (since $\sum_{s=\pm 1}s^k = 0$ when $k$ is odd).


It can thus be seen that a term such as the one in (6-47) corresponds to a closed graph on the Ising lattice whose edges ($l$ in number in the present instance) are made up of bonds (i.e., lines joining nearest neighbour sites) where the edges can intersect at lattice sites and where each site of the finite lattice can be a meeting point of 0, 2, or 4 edges, of which the first possibility means that the site referred to is not on the graph under consideration (reason this out). Fig. 6-9 depicts an instance of such a graph with $l = 8$ where each of six vertices (i.e., lattice sites on the graph) is the meeting point of two edges, and one vertex is a meeting point of four edges. There can be more than one graph with any given number of edges (for instance, a second example of a graph with eight edges is one with four sides, each made up of two edges).


If the symbol $G$ denotes a closed graph of the type indicated above, and $l(G)$ is the number of edges (i.e., their total length, since each edge is assumed to be of length unity) then the partition function can be expressed in the form

$$Z = 2^N(\cosh K)^B\sum_G(\tanh K)^{l(G)}, \tag{6-48}$$

where the factor of $2^N$ appears on summing over all possible spin configurations, since each lattice site contributes a factor of two ($\sum_{s=\pm 1}s^k = 2$ when $k = 0, 2$, or 4), and the summation on the right hand side is over all possible graphs $G$ of the type indicated above (refer to [66]).

Figure 6-9: Depicting a closed graph in a square array for the 2-dim Ising model; the graph is made up of vertices (namely, the lattice sites at which the spins are located) and edges (namely, the bonds between nearest neighbour lattice sites), where an even number of edges (2 or 4) meet at each vertex (for a lattice site not located on a closed graph, the number of edges meeting at the site is 0); such closed graphs are relevant for the high temperature representation of the partition function given by (6-48); a similar representation holds for dimension $D > 2$ as well.
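The representation (6-48) can be verified by brute force on a very small lattice. The sketch below assumes the free boundary condition on an $L \times L$ array and takes the sum over graphs to run over all sets of bonds in which every site is the meeting point of an even number of edges, including the empty set (which contributes the term 1) and unions of disjoint closed paths; the function names are hypothetical, and the enumeration is feasible only for very small $L$.

```python
import itertools
import math

def lattice_bonds(L):
    # Nearest neighbour bonds of an L x L array with free boundary condition.
    horizontal = [((r, c), (r, c + 1)) for r in range(L) for c in range(L - 1)]
    vertical = [((r, c), (r + 1, c)) for r in range(L - 1) for c in range(L)]
    return horizontal + vertical

def z_direct(L, K):
    # Partition function (6-44) by summing over all spin configurations (K = beta * J).
    sites = [(r, c) for r in range(L) for c in range(L)]
    bond_list = lattice_bonds(L)
    total = 0.0
    for config in itertools.product((-1, 1), repeat=len(sites)):
        s = dict(zip(sites, config))
        total += math.exp(K * sum(s[a] * s[b] for a, b in bond_list))
    return total

def z_graphs(L, K):
    # Partition function from (6-48): sum over bond sets with an even number of edges at every site.
    bond_list = lattice_bonds(L)
    graph_sum = 0.0
    for mask in range(1 << len(bond_list)):
        degree = {}
        length = 0
        for k, (a, b) in enumerate(bond_list):
            if (mask >> k) & 1:
                degree[a] = degree.get(a, 0) + 1
                degree[b] = degree.get(b, 0) + 1
                length += 1
        if all(d % 2 == 0 for d in degree.values()):
            graph_sum += math.tanh(K) ** length
    return 2 ** (L * L) * math.cosh(K) ** len(bond_list) * graph_sum
```

For $L = 2$ the only admissible graphs are the empty one and the single square of four edges, so that $Z = 2^4(\cosh K)^4(1 + \tanh^4 K)$; for $L = 3$ the two functions are expected to agree to within rounding errors.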


This constitutes a useful high temperature representation of the Ising model that holds in dimensions $D \geq 2$ (though we have considered the case $h = 0$, the representation holds for $h \neq 0$ as well) and can be made use of in estimating mean values of relevance in the high temperature regime ($\beta$, $K$ small for any given coupling strength $J$, $\tanh K \approx K$), where one finds that the system admits of a unique equilibrium state. An important result that can be deduced along the above line of reasoning relates to the mean value of the spin at any lattice site sufficiently removed from the boundary. Before stating the result for this important quantity, one needs to know the effect of boundary conditions on the right hand side of (6-48).


Both $B$ and the set of graphs $G$ depend on the boundary conditions on the finite lattice $\Lambda$, which we assume (by default) to be a square array with two opposite corners having co-ordinates, say, $(-n, -n)$, $(n, n)$ (in the case of a 2-dim lattice) for some positive integer $n$ (the thermodynamic limit then corresponds to $n \to \infty$). For instance, in the case of the ‘+’ boundary condition ($\eta_+$, see fig. 6-7(A)), one has $B = B^{(+)} = 2N$ for the 2-dim lattice, while the dependence of $\{G\}$ (we denote a typical graph belonging to this set by $G^{(+)}$ for the boundary condition under consideration) is much more complex. The case of the free boundary condition will be considered below with reference to the dual lattice.


Based on all this, one can now state a number of results on $\langle\sigma_0\rangle^{(+)}$, the
mean value of the spin located at the origin (chosen as the site at the
centre of the square array $\Lambda_n$ referred to above, i.e., of the finite lattice
under consideration; recall that, in our notation, a finite set of sites in an
infinitely extended lattice is generally denoted by the symbol $\Lambda$ which, in
the present instance, refers to the array $\Lambda_n$).

This mean value is of crucial relevance in the context of a phase transition
since the criterion for a phase transition is that, in the thermodynamic
limit, $\langle\sigma_0\rangle^{(+)} \neq 0$ (likewise, $\langle\sigma_0\rangle^{(-)} \neq 0$, where $\langle\sigma_0\rangle^{(-)}$ corresponds to the ‘-’
boundary condition, i.e., to $\eta_-$, see fig. 6-7(B)). Put differently, the
dependence of $\langle\sigma_0\rangle$ on the boundary condition in the case of a finite lattice
is carried over to the thermodynamic limit for a temperature at which
a phase transition takes place (recall that the field strength $h$ has been
assumed to be 0 in the present context).


Issues relating to phase transition will be taken up at greater length below 


(sec. 6.2.5). 


Based on the analysis leading to the high temperature representation (6-48),
one can obtain a bound for $\langle\sigma_0\rangle^{(+)}$ that applies for sufficiently small
values of $\beta$ (see [41] for details). In particular, for a 2-dim square array $\Lambda_n$
with corners at $(-n,-n)$ and $(n,n)$, one obtains the bound

$\langle\sigma_0\rangle^{(+)} \le e^{-c(\beta) n}$,    (6-49)


with $c(\beta) > 0$ for all $\beta < \frac{1}{16J}$ (for a $D$-dim lattice, the upper limit on $\beta$
generalizes to a $D$-dependent value; see [41]). In other words, $\langle\sigma_0\rangle^{(+)} \to 0$
in the thermodynamic limit,
implying that the 2-dim Ising model does not admit of a phase transition
at sufficiently high temperatures, and that the state of the system at
any specified temperature is unique in this regime (recall that the field
strength $h$ has been assumed to be zero; however, the state turns out to
be unique for any specified non-zero value of $h$ at all temperatures). As
we will see in the course of subsequent discussions, this result holds
for the Gibbs state of the system defined for an infinitely extended lattice
without reference to a finite lattice of $N$ spins, where $N$ is made to tend
to $\infty$.


B. Low temperature representation: contours in the dual lattice 


The dual lattice for a planar square lattice is constituted by the centres 
of its unit cells and is itself a square lattice. The lines joining nearest 
neighbours in the dual lattice are its edges that run parallel to the bonds 
of the direct lattice. This is shown in fig. 6-10 where the square ABCD is 
a unit cell of the direct lattice, while its centre P is a lattice point in the 
dual lattice. In the figure, the circles are the lattice points of the direct 


lattice, while the crosses mark those of the dual lattice. 


The edges in the dual lattice are perpendicular bisectors of those of the
direct lattice, i.e., of the lines joining the nearest neighbours of the latter.
For a given spin configuration in the direct lattice, a set of edges of the
dual lattice acquires special relevance, namely, those that bisect the lines
joining opposite spins, i.e., spins with values $+1$ and $-1$. These edges of
the dual lattice separating opposite spins make up a set of contours that
may contain several disjoint parts, some of which may be open while
the rest are closed polygons that may be self-intersecting at the vertices
(an open contour is possible, for instance, in the case of the Dobrushin
boundary condition). The disposition of contours depends on the spin
configuration in the direct lattice as also on the boundary conditions. In
particular, in the case of the ‘+’ and the ‘-’ boundary conditions ($\eta_+$ and $\eta_-$
in our nomenclature) all the contours are closed (see [46], [41] for further
details).

Figure 6-10: Depicting a finite Ising lattice in the form of a square array, and its dual
lattice; sites external to the array are not shown; sites of the direct lattice are marked
with circles while the cross marks indicate those of the dual lattice; thus, the square
ABCD has direct lattice sites at its corners while its centre P is a dual lattice site; dotted
lines are the edges of the dual lattice and are perpendicular bisectors of edges of the
direct lattice that have nearest neighbor sites at their end-points.


Since an edge (of unit length) of a contour in the dual lattice separates
every nearest neighbour pair of opposite spins in the direct lattice, the
energy of a spin configuration can be expressed in the form

$H = -BJ + 2JL$,    (6-50a)

where $L$ stands for the total length of the (composite) contour and $B$
denotes, as before, the total number of bonds (including those involving the
external spins, if any, that may be implied by the boundary condition).
The first term in the above expression gives the energy on the supposition
that all the bonds are between like spins (each bond having an energy
$-J$), while the second term gives the correction due to the $L$ bonds
between opposite spins (since each high energy bond gives a correction of
$2J$ over $-J$).
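Formula (6-50a) can be checked on a small lattice. The following is a minimal sketch, assuming a 5 x 5 array ($n = 2$) surrounded by external ‘+’ spins and $J = 1$; the helper names (bonds, spin, energy_direct, energy_from_contours) are illustrative and do not come from the text. It compares the energy obtained by summing $-J s_i s_j$ over all bonds with the value $-BJ + 2JL$, where $L$ is counted as the number of bonds joining opposite spins.

```python
import random

def bonds(n):
    """Nearest-neighbour bonds with at least one end inside the (2n+1) x (2n+1) array."""
    def inside(site):
        return -n <= site[0] <= n and -n <= site[1] <= n
    result = []
    for x in range(-n - 1, n + 1):
        for y in range(-n - 1, n + 1):
            for dx, dy in ((1, 0), (0, 1)):
                a, b = (x, y), (x + dx, y + dy)
                if inside(a) or inside(b):
                    result.append((a, b))
    return result

def spin(config, site, n):
    """Spin at a site; sites outside the array carry the external '+' spins."""
    x, y = site
    if -n <= x <= n and -n <= y <= n:
        return config[site]
    return +1

def energy_direct(config, n, J=1.0):
    """Energy as the sum of -J s_i s_j over all bonds."""
    return -J * sum(spin(config, a, n) * spin(config, b, n) for a, b in bonds(n))

def energy_from_contours(config, n, J=1.0):
    """Energy as -B J + 2 J L, with L the number of bonds between opposite spins."""
    all_bonds = bonds(n)
    B = len(all_bonds)
    L = sum(1 for a, b in all_bonds if spin(config, a, n) != spin(config, b, n))
    return -B * J + 2 * J * L

n = 2
random.seed(1)
config = {(x, y): random.choice((+1, -1))
          for x in range(-n, n + 1) for y in range(-n, n + 1)}
print(energy_direct(config, n), energy_from_contours(config, n))
```

Each bond between opposite spins is crossed by exactly one dual lattice edge, which is why counting such bonds gives the contour length $L$ without constructing the contours explicitly.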


In the case of the boundary conditions $\eta_+$ and $\eta_-$, the correspondence
between a given disposition of the contour in the dual lattice and the spin
configuration in the direct lattice that produces this disposition is one-to-one,
while in the case of the periodic boundary condition and the free boundary
condition the correspondence is one-to-two because of the symmetry under
a global spin flip in the direct lattice.


Considering the boundary condition $\eta_+$ for the sake of concreteness, the
set of closed contours (making up a composite contour), some of which
may be self-intersecting ones, for any given spin configuration may be
converted into one made up of simple (i.e., non-self-intersecting) closed
ones by the convention indicated in fig. 6-11(A). Fig. 6-11(B) depicts an
application of this procedure, where only a part of the total spin
configuration in the direct lattice is shown.


Figure 6-11: The convention of converting self-intersecting closed polygons in the
dual lattice to simple closed contours; (A) a vertex of a self-intersecting contour that is
the meeting point of four edges (only these edges are shown) is to be clipped at the point
of intersection so as to produce two edges each of simpler contours; (B) an instance
where the rule is applied so as to convert (shown by arrow) a self-intersecting polygon
(part of the composite contour corresponding to some specified spin configuration in the
direct lattice) into two simple closed contours.

In other words, any given spin configuration in the direct lattice gives rise
to a set of simple closed contours, say, $\{\gamma_1, \gamma_2, \ldots, \gamma_r\}$ ($r = 1, 2, \ldots$), where
the length (i.e., the number of dual lattice edges) of $\gamma_i$ ($i = 1, 2, \ldots, r$) is,
say, $L_i$. Then the energy of that configuration is given by

$H = -BJ + 2J\sum_{i=1}^{r} L_i$,    (6-50b)


where the number ($r$) of simple contours and the lengths $L_1, L_2, \ldots, L_r$
depend on the configuration under consideration (and on the chosen
boundary condition $\eta_+$). Considering, then, all possible spin configurations
in the direct lattice, one obtains the following expression for the
partition function

$Z^{(+)} = e^{KB} \sum_{\{s\}} \prod_{\{\gamma\}} e^{-2K L(\gamma)}$  ($K = \beta J$),    (6-51)

where the summation is over all possible spin configurations ($\{s\} = \{s_1, s_2, \ldots, s_N\}$
denotes a typical configuration) and, given any specified spin configuration,
$\prod_{\{\gamma\}}$ denotes a product over all the closed contours in the dual
lattice (such as $\{\gamma_1, \gamma_2, \ldots, \gamma_r\}$ mentioned above) resulting from that
configuration, while $L(\gamma)$ is the length of the contour $\gamma$, so that the exponents
add up to the total length of the edges making up all these closed contours
(i.e., $\sum_{\gamma} L(\gamma) = \sum_{i=1}^{r} L_i$ in the example mentioned above). The
superscript ‘(+)’ tells us that the boundary condition $\eta_+$ is being referred
to (the boundary condition $\eta_-$ leads to analogous expressions).


Formula (6-51) constitutes a useful low temperature representation ($K \gg 1$)
for the partition function pertaining to a finite Ising lattice. The
argument, originally put forward by Peierls, for the existence of a phase
transition in the 2-dim Ising model is based on such a representation (the
argument carries over to Ising models in dimensions $D > 2$).


6.2.4.3 Existence of phase transition: Peierls’ argument 


The probability of a spin configuration $\{s\}$ under the boundary condition
$\eta_+$ is given by

$P^{(+)}(\{s\}) = \frac{e^{KB}}{Z^{(+)}} \prod_{\{\gamma\}} e^{-2K L(\gamma)}$,    (6-52)

where the product on the right hand side is over all the simple closed
contours associated with $\{s\}$. This expression can now be made use of in
evaluating the probability that the spin at any specified site has the value
$-1$ under the boundary condition $\eta_+$. Assuming that the finite lattice
under consideration is a square array $\Lambda_n$ with opposite corners at $(-n,-n)$
and $(n,n)$ (where the co-ordinate axes are parallel to the lines joining
adjacent lattice sites; $n = 1, 2, \ldots$), we choose for the sake of concreteness
the site at the origin, in which case, following a course of reasoning as
in [41], one obtains

$P^{(+)}(s_0 = -1) \le \sum_{\gamma} e^{-2\beta J L(\gamma)}$,    (6-53)

where the summation is over all simple closed contours $\gamma$ that enclose the
origin, i.e., the point chosen within the finite lattice $\Lambda_n$, and $L(\gamma)$ stands
for the length of a contour $\gamma$ (i.e., the total number of edges of the dual
lattice that the contour is made of; the minimum value for $L(\gamma)$ is 4).


Peierls’ argument essentially tells us that for sufficiently large $\beta$, while
the number of contours $\gamma$ proliferates with increasing length $L$, the factor
$e^{-2\beta J L(\gamma)}$ dominates in making the above sum converge to a value that
goes to zero as $\beta \to \infty$. Since the lattice site under consideration has been
chosen at the origin in a two dimensional co-ordinate system solely for
the sake of a specific illustration and is otherwise arbitrary (the choice
of origin does not matter in the thermodynamic limit), this establishes
the existence of a phase transition wherein the probability of an
arbitrarily chosen site to have a spin value $-1$ is seen to be arbitrarily small
for $\beta \to \infty$ under the boundary condition $\eta_+$, implying that $\langle\sigma_0\rangle^{(+)} \to 1$.
This is indicative of a transition to an ordered regime from the disordered
(‘paramagnetic’) regime for small $\beta$ where we found that $\langle\sigma_0\rangle^{(+)} \to 0$
under the same boundary condition. In the low temperature regime, the
spin-flipped boundary condition $\eta_-$ would give $\langle\sigma_0\rangle^{(-)} \to -1$, implying that
the equilibrium state is dependent on the boundary condition. The above
considerations tell us that there exist at least two equilibrium states at
any sufficiently low temperature that differ in the value of the order
parameter $\langle\sigma_0\rangle$, of which the boundary condition chooses one. The order
parameter is a quantity that depends sensitively on the probability
distribution over microstates. The probability distribution differs for the two
equilibrium states, though the thermodynamic quantities like the specific
free energy have uniquely defined values, independent of boundary conditions.
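The competition between the number of contours and the exponential factor in (6-53) can be illustrated numerically. The sketch below does not reproduce the estimate of [41]; it assumes, as a crude counting bound introduced here for illustration, that the number of simple closed contours of length $L$ enclosing the origin is at most $(L/2)\,3^{L-1}$, and sums the resulting upper bound on the right hand side of (6-53). The function name peierls_bound and the truncation length are illustrative choices.

```python
import math

def peierls_bound(K, max_len=400):
    """Upper estimate of the contour sum in (6-53), with K = beta * J.

    Assumption (not from the text): at most (L/2) * 3**(L-1) simple closed
    contours of length L enclose the origin. With this count the series
    converges only for 3 * exp(-2K) < 1; it is truncated at max_len.
    """
    total = 0.0
    for L in range(4, max_len + 1, 2):   # closed contours have even length >= 4
        # (L/2) * 3**(L-1) * exp(-2*K*L), rewritten to avoid large intermediate numbers
        total += (L / 6.0) * math.exp(L * (math.log(3.0) - 2.0 * K))
    return total

for K in (0.8, 1.0, 1.5, 2.0):
    b = peierls_bound(K)
    # <sigma_0>^(+) = 1 - 2 P(s_0 = -1) >= 1 - 2 * bound
    print(f"K = {K:.1f}: P(s_0 = -1) <= {b:.3e}, <sigma_0> >= {1.0 - 2.0 * b:.6f}")
```

Within this crude estimate, the bound on the probability drops below 1/2 once $K$ is moderately large, which already forces $\langle\sigma_0\rangle^{(+)} > 0$, and it tends to zero as $K \to \infty$.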


1. Put differently, Peierls’ argument tells us that the two equilibrium
states are small perturbations over the two possible ground states (one
with all the spins in the ‘+’ state and the other with all spins in the ‘-’
state, for an infinitely extended system; in the case of a finite system
the boundary condition chooses a unique ground state). Choosing the
boundary condition $\eta_+$, one obtains an equilibrium state where the ‘-’
spins are rare and, considering a state in which the spin at the origin
has a value $-1$, the closed contours in the dual lattice enclosing the
origin for this configuration are few in number and contours of large
length are relatively rare, i.e., ‘-’ spins are like small islands in a sea of
‘+’ spins.

2. The two equilibrium states mentioned above are not the only ones
possible in the low temperature regime. For instance, the Dobrushin
boundary condition $\eta_D$ (see fig. 6-8) leads to an equilibrium state that
is a mixture of two pure phases. The full structure of the equilibrium
states in the low temperature regime is obtained by referring to the
theory of Gibbs states for the infinitely extended lattice (see sec. 6.2.5).


6.2.4.4 The Kramers-Wannier duality principle 


Peierls’ argument establishes that a phase transition occurs at some low
temperature (corresponding to $\beta = \beta_c < \infty$), but does not locate that
temperature. While an estimate in the form of an upper bound can be
obtained by a careful and judicious application of the Peierls argument
(see [41]), an alternative (and closely related) approach due to Kramers
and Wannier locates the exact critical temperature ($T_c = (k_B\beta_c)^{-1}$) by a
duality argument in which the high- and the low-temperature
representations (formulae (6-48) and (6-51)) of the partition function are compared
by referring to the dual lattice.


Recall, first of all, that the high temperature representation (6-48) applies
to the ‘+’ boundary condition ($\eta_+$). The corresponding representation for
the open boundary condition is conveniently obtained by referring to the
dual lattice. Given the finite square array $\Lambda_n$, we consider the square
array in the dual lattice of minimal size enclosing $\Lambda_n$ (call it $\Lambda^*$). If $N^*$
denotes the number of lattice points in $\Lambda^*$ and $B^*$ the number of edges
(analogous to bonds in $\Lambda_n$) then, at any given temperature $T^* = (k_B\beta^*)^{-1}$,
one obtains the following expression for the partition function (for the
Ising model defined on the array $\Lambda^*$) under the open boundary condition

$Z^{*(\mathrm{open})} = 2^{N^*} (\cosh(\beta^* J))^{B^*} \sum_{G^*} (\tanh(\beta^* J))^{L(G^*)}$,    (6-54)

where $G^*$ denotes a typical closed graph in the dual lattice (square array
$\Lambda^*$) such that any vertex is a meeting point of 0, 2 or 4 edges, $L(G^*)$ is the
number of edges making up $G^*$, and the summation is over all such closed
graphs. It turns out that any such closed
graph coincides with the edges of the contour (in the dual lattice)
corresponding to a unique spin configuration (in $\Lambda_n$; recall that such a contour
in the dual lattice, in turn, can be converted into a set of simple closed
contours as indicated in fig. 6-11). Hence, if $\beta$ and $\beta^*$ are related as

$\tanh(\beta^* J) = e^{-2\beta J}$,    (6-55)

then the sums on the right hand sides of (6-51) and (6-54) become equal,
and one obtains

$\frac{Z^{*(\mathrm{open})}}{2^{N^*} (\cosh(\beta^* J))^{B^*}} = \frac{Z^{(+)}}{e^{\beta J B}}$.    (6-56)


We recall that, in this equation, $Z^{*(\mathrm{open})}$ and $Z^{(+)}$ are partition functions
for the 2-dim finite Ising lattice corresponding to square arrays of lattice
points $\Lambda^*$, $\Lambda_n$ respectively, the former under the open boundary condition
and the latter under the ‘+’ boundary condition. The number of lattice
points in the direct lattice is $N = (2n+1)^2$ while that in the dual lattice
is $N^* = (2n+2)^2$. Similarly, $B$, $B^*$ denote the number of edges in the
direct lattice (including edges connecting to the ‘+’ spins constituting the
‘+’ boundary condition), and the dual lattice respectively.


It is straightforward to verify that in the thermodynamic limit ($n \to \infty$,
$N \to \infty$), one has

$\frac{N^*}{N} \to 1$,  $\frac{B^*}{N} \to 2$,  $\frac{B}{N} \to 2$,    (6-57)

by construction of the dual lattice. Thus, taking the logarithm of both
sides of (6-56), dividing by $N$, and going over to the thermodynamic limit,
one obtains

$\lim_{N\to\infty} \frac{1}{N}[\ln Z^{*(\mathrm{open})} - \ln Z^{(+)}] = \lim_{N\to\infty} \frac{1}{N}[N^* \ln 2 + B^* \ln\cosh(\beta^* J) - \beta J B]$,    (6-58)


whereupon, making use of (6-55), (6-57), and also of the result that the
specific free energy $f = -\beta^{-1}\frac{1}{N}\ln Z$ is independent of the boundary
condition in the thermodynamic limit, one arrives at the following result

$-[\beta^* f(\beta^*) - \beta f(\beta)] = \ln\sinh(2\beta^* J)$  ($h = 0$),    (6-59)

(check this out).
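The check suggested above can be carried out in a few lines. The following worked steps are a sketch that assumes the limiting ratios $N^*/N \to 1$, $B^*/N \to 2$, $B/N \to 2$ for the square arrays considered here, and uses the duality relation (6-55) in the form $-2\beta J = \ln\tanh(\beta^* J)$.

```latex
\begin{align*}
\lim_{N\to\infty}\frac{1}{N}\left[\ln Z^{*(\mathrm{open})} - \ln Z^{(+)}\right]
  &= -\beta^* f(\beta^*) + \beta f(\beta)
  && \text{(boundary-independence of } f\text{)} \\
\lim_{N\to\infty}\frac{1}{N}\left[N^*\ln 2 + B^*\ln\cosh(\beta^* J) - \beta J B\right]
  &= \ln 2 + 2\ln\cosh(\beta^* J) - 2\beta J \\
  &= \ln 2 + 2\ln\cosh(\beta^* J) + \ln\tanh(\beta^* J) \\
  &= \ln\left[2\sinh(\beta^* J)\cosh(\beta^* J)\right]
   = \ln\sinh(2\beta^* J).
\end{align*}
```

Equating the two limits gives $-[\beta^* f(\beta^*) - \beta f(\beta)] = \ln\sinh(2\beta^* J)$, which is (6-59).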


In writing the above formula we emphasize that the field strength $h$ has
been assumed to be zero in our present considerations. The formula
itself tells us that, in the thermodynamic limit, the values of $-\beta f$ ($f$ being the
specific free energy of the 2-dim Ising spin system) at temperatures
$T^* = (k_B\beta^*)^{-1}$ and $T = (k_B\beta)^{-1}$, where $\beta$, $\beta^*$ are related as in (6-55), differ by an
analytic function of $\beta^*$ (and hence of $\beta$). In other words, if $f$ is non-analytic at some
value of the parameter $\beta$, say, at $\beta = \beta_c$, it has to be non-analytic at the
corresponding value of $\beta^*$ (call it $\beta^*_c$) as well. Assuming, then, that $f$ is
non-analytic at one single temperature, the value of $\beta_c$ has to satisfy

$\beta_c = \beta^*_c$,    (6-60a)

i.e., from (6-55),

$\beta_c J = \frac{1}{2}\ln(1 + \sqrt{2})$,    (6-60b)

(check this out as well).
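The self-dual point can also be located numerically. The sketch below (function names dual_coupling and self_dual_point are illustrative) treats (6-55) as a map $K \mapsto K^*$ with $K = \beta J$, finds its fixed point by bisection, and compares it with $\frac{1}{2}\ln(1+\sqrt{2}) \approx 0.4407$.

```python
import math

def dual_coupling(K):
    """K* = beta* J defined by tanh(K*) = exp(-2K), cf. (6-55)."""
    return math.atanh(math.exp(-2.0 * K))

def self_dual_point(lo=0.1, hi=2.0, tol=1e-14):
    """Bisection for the fixed point K = K* of the duality map."""
    def g(K):
        return dual_coupling(K) - K      # strictly decreasing in K
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K_c = self_dual_point()
print("fixed point of the duality map:", K_c)
print("(1/2) ln(1 + sqrt 2)          :", 0.5 * math.log(1.0 + math.sqrt(2.0)))
print("sinh(2 K_c)                   :", math.sinh(2.0 * K_c))
```

The last line reflects the symmetric form of the duality relation, $\sinh(2K)\sinh(2K^*) = 1$, which reduces to $\sinh(2K_c) = 1$ at the fixed point.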


This result is borne out by Onsager’s derivation of the specific free energy
of the Ising system in the thermodynamic limit.


In the context of models of magnetic systems, the quantity $\ln Z(N, \beta, h)$
whose thermodynamic limit is being considered is actually to be looked upon
as the analog of the grand potential that was introduced earlier for a fluid,
where $\beta = (k_B T)^{-1}$, $N$ is the number of spins, and $h$ represents the field
strength (the statement that the thermodynamic limit exists for the system
then means that this quantity is asymptotically proportional to $N$,
regardless of the boundary condition).
Accordingly, $\lim_{N\to\infty}\frac{1}{N}\ln Z(N, \beta, h)$ is referred to as the pressure $p(\beta, h)$ of
the system, though the physical interpretation of this quantity differs from
that of pressure in the case of a fluid (refer to [46], chapter 5). In the special
case of $h = 0$ that we have been considering here, the quantity $-\beta^{-1}\frac{1}{N}\ln Z$
is referred to as the specific free energy ($f$), which is an analytic function of
$\beta$ except at a temperature where a phase transition takes place (refer back to
sec. 5.7; there may be several such temperatures; in the case of the Ising
system there is actually only one).


Incidentally, in the case of lattice systems, a finite system is commonly
defined by specifying a finite set of lattice points (say, $\Lambda$) in an infinitely
extended lattice rather than by the number of lattice points ($N$), the latter
being of use in the special case that $\Lambda$ is of a simple structure (like that
of an array $\Lambda_n$ considered above).


In the next section we look at the basic idea underlying the transfer
matrix method in Onsager’s derivation of the specific free energy which
constructively establishes that the 2-dim Ising model admits of a phase
transition at one single temperature that satisfies (6-60b).


6.2.4.5 2-dim Ising model: Onsager’s result 


In this section I will present, following [67], the basic idea underlying
Onsager’s derivation of the specific free energy, in the thermodynamic
limit, of the 2-dim Ising system in zero field, where the partition function
is evaluated by the transfer matrix method. While the application of the
transfer matrix was seen to be trivial in the 1-dim case, the 2-dim
situation is far less so, and, as elsewhere in this book, we present only a
summary outline of what the derivation entails without proofs of most of
the statements involved.


In this section I take upon myself the job of putting together, in clear and
concise terms as best as I can, the principal links in the chain of complex
and subtle arguments that you will find in greater detail in [67].


We begin by referring to an $n \times n$ ($n = 2, 3, \ldots$) square array of spins, where
the array will be looked upon as being made of $n$ rows of spins, the various
rows being marked with a super-index $\alpha$ ($= 1, 2, \ldots, n$). The configuration
of spins in the entire array (recall that a spin $\sigma$ is an object that can take
up two values $s = \pm 1$) will be specified in terms of the configurations in
the various rows, where the configuration for row $\alpha$ is denoted by
$\mu^{(\alpha)} = \{s_1^{(\alpha)}, s_2^{(\alpha)}, \ldots, s_n^{(\alpha)}\}$. At times, the super-index $\alpha$ will be suppressed for the
sake of brevity, and we will simply refer to the configuration $\mu$ of a row.


The interaction between the spins constituting the system is specified in
terms of the interaction Hamiltonian which is basically built up from the
Hamiltonian $H(\sigma_1, \sigma_2)$ of two spins such that the energy for a configuration
$\{s_1, s_2\}$ of the spins is $E(s_1, s_2)$, given by

$E(s_1, s_2) = -J s_1 s_2 - \sum_{i=1}^{2} h s_i$,    (6-61)

which holds for spins that are nearest neighbours of each other, the
energy of all non-nearest pairs being zero. In this expression, $J$ ($> 0$) is the
coupling constant between spins, and $h$ stands for the ‘magnetic field’
(recall that the Ising model is a fictitious one, though it is of relevance
in explaining the behavior of real life systems). In the following, we will
continue to consider the zero-field case $h = 0$.


As a matter of notation, it proves to be convenient to look upon the
interaction energy $E(s_1, s_2)$ as the matrix element of the Hamiltonian $H$ in a two
dimensional space, and denote it by the symbol $\langle s_1|H|s_2\rangle$. The advantage
of this notation lies in the fact that the energy of a configuration of
multiple spins can be expressed as a matrix element in a composite space that
is a direct product of two-dimensional spin spaces.


In the following, we will assume that the interaction between spins is 
subject to periodic boundary conditions which means that the array of 
spins is imagined to be wrapped around in such a manner as to make 
spins in the nth and the first rows interact as nearest neighbors, while 


the same holds for the spins in the nth and the first columns as well. 


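If $\mu^{(\alpha)}$ and $\mu^{(\gamma)}$ denote the spin configurations of two adjacent rows (i.e.,
$\gamma = \alpha + 1$ ($\alpha = 1, 2, \ldots, n$), with $\mu^{(n+1)} = \mu^{(1)}$, $\mu^{(0)} = \mu^{(n)}$), then their
contribution to the energy of the entire array is of the form

$E(\mu^{(\alpha)}, \mu^{(\gamma)}) = E^{(\mathrm{m})}(\mu^{(\alpha)}, \mu^{(\gamma)}) + E(\mu^{(\alpha)})$,    (6-62)

where $E(\mu^{(\alpha)})$ stands for the energy of interaction of the spins in the row
$\alpha$ considered independently of the rows adjacent to it, and $E^{(\mathrm{m})}(\mu^{(\alpha)}, \mu^{(\gamma)})$
is the mutual energy of the two rows, arising from the interaction
between adjacent spins belonging to the two (only one of the two energies
$E(\mu^{(\alpha)})$, $E(\mu^{(\gamma)})$ is included in the above expression in order to avoid
double counting in the expression for the total energy of the entire lattice).
Thus, with $h = 0$,

$E(\mu^{(\alpha)}) = -J\sum_{k=1}^{n} s_k^{(\alpha)} s_{k+1}^{(\alpha)}$  ($s_{n+1}^{(\alpha)} = s_1^{(\alpha)}$),
$E^{(\mathrm{m})}(\mu^{(\alpha)}, \mu^{(\gamma)}) = -J\sum_{k=1}^{n} s_k^{(\alpha)} s_k^{(\gamma)}$,    (6-63)

where $s_k^{(\alpha)}$ stands for the value ($+1$ or $-1$) of the spin in row $\alpha$ and column $k$
($k = 1, 2, \ldots, n$).

A spin configuration of the entire array can be denoted by $\{\mu^{(1)}, \mu^{(2)}, \ldots, \mu^{(n)}\}$,
in terms of the configurations of the individual rows $\mu^{(\alpha)}$ ($\alpha = 1, 2, \ldots, n$).
The total energy of the array is then of the form

$E(\mu^{(1)}, \ldots, \mu^{(n)}) = \sum_{\alpha=1}^{n} [E^{(\mathrm{m})}(\mu^{(\alpha)}, \mu^{(\alpha+1)}) + E(\mu^{(\alpha)})]$  ($\mu^{(n+1)} = \mu^{(1)}$).    (6-64)

The partition function of the spin system at temperature $T = (k_B\beta)^{-1}$ is
then given by

$Z = \sum_{\mu^{(1)}} \cdots \sum_{\mu^{(n)}} \exp[-\beta E(\mu^{(1)}, \ldots, \mu^{(n)})]$,    (6-65)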


where $\sum_{\mu^{(\alpha)}}$ denotes a summation over all possible spin configurations of
the row $\alpha$ ($\alpha = 1, 2, \ldots, n$).

As in the case of the energy of a pair of spins ($E(s_1, s_2)$) one can formally
look at the expression $\exp[-\beta E(\mu^{(\alpha)}, \mu^{(\gamma)})]$ as a matrix element of a transfer
matrix $P$ (say) in a $2^n$-dimensional product space:

$\langle\mu^{(\alpha)}|P|\mu^{(\gamma)}\rangle = \langle s_1^{(\alpha)}, s_2^{(\alpha)}, \ldots, s_n^{(\alpha)}|P|s_1^{(\gamma)}, s_2^{(\gamma)}, \ldots, s_n^{(\gamma)}\rangle = \exp[-\beta E(\mu^{(\alpha)}, \mu^{(\gamma)})]$.    (6-66)

It is then straightforward to see that, analogous to the 1-dim case,

$Z = \mathrm{Tr}\, P^n$    (6-67)

(check this out; make use of the periodic boundary condition; however,
formula (6-67) requires only the periodicity for the rows, where the top
and the bottom rows are assumed to be adjacent; the periodicity for the
columns is necessary only for convenience in the diagonalization of $P$).


If the eigenvalues of $P$ ($2^n$ in number) are denoted by $\lambda_1, \lambda_2, \ldots, \lambda_{2^n}$, then
one obtains the partition function as

$Z = \sum_{\mu=1}^{2^n} (\lambda_\mu)^n$.    (6-68)

Since the energy of a pair of adjacent rows grows like $n$ (reason this out),
one can estimate the partition function in the thermodynamic limit by
assuming that $\lim_{n\to\infty}\frac{1}{n}\ln\lambda_{\max}$ is finite, where $\lambda_{\max}$ stands for the maximum
of the eigenvalues of $P$. Assuming further that the eigenvalues of $P$ are
all positive (both these suppositions turn out to be true, as seen from the
results to be stated later), and making use of the relation $N = n^2$ between
the number of rows or columns and the number of lattice points in the
array, one arrives at the result that, analogous to the 1-dim case,

$\lim_{N\to\infty}\frac{1}{N}\ln Z = \lim_{n\to\infty}\frac{1}{n}\ln\lambda_{\max}$.    (6-69)
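The relation between the partition function and the trace of the $n$th power of the transfer matrix can be checked by brute force on a very small array. The sketch below assumes a 3 x 3 array with periodic boundary conditions in both directions, zero field, and the row energies written as in the text (interaction within a row plus the mutual energy with the next row); all function names are illustrative.

```python
import itertools
import math

def row_energy(mu, J=1.0):
    """Interaction within one row, with periodic columns (s_{n+1} = s_1)."""
    n = len(mu)
    return -J * sum(mu[k] * mu[(k + 1) % n] for k in range(n))

def mutual_energy(mu, nu, J=1.0):
    """Mutual energy of two adjacent rows."""
    return -J * sum(a * b for a, b in zip(mu, nu))

def transfer_matrix(n, beta, J=1.0):
    """Matrix elements exp(-beta [mutual energy + energy of the first row])."""
    rows = list(itertools.product((+1, -1), repeat=n))
    P = [[math.exp(-beta * (mutual_energy(mu, nu, J) + row_energy(mu, J)))
          for nu in rows] for mu in rows]
    return rows, P

def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def trace_of_power(P, n):
    """Tr P^n by repeated multiplication."""
    M = P
    for _ in range(n - 1):
        M = matmul(M, P)
    return sum(M[i][i] for i in range(len(M)))

def brute_force_Z(n, beta, J=1.0):
    """Direct sum of exp(-beta E) over all 2^(n*n) configurations."""
    rows = list(itertools.product((+1, -1), repeat=n))
    Z = 0.0
    for config in itertools.product(rows, repeat=n):    # one row configuration per row
        E = sum(mutual_energy(config[a], config[(a + 1) % n], J)
                + row_energy(config[a], J) for a in range(n))
        Z += math.exp(-beta * E)
    return Z

rows, P = transfer_matrix(3, 0.4)
print(trace_of_power(P, 3), brute_force_Z(3, 0.4))
```

Only the periodicity in the row index is used in closing the trace; the periodic columns enter through row_energy alone.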


The rest of the theory goes to the diagonalization of the transfer matrix.
The basic idea is to express $P$, of dimension $2^n$ ($n$ large), in terms of
direct products of simpler matrices whose dimensions are small. One
can check by direct substitution that (refer to [67] for details; recall that
our considerations in the present section are confined to the zero-field
case $h = 0$)

$P = [2\sinh(2\beta J)]^{n/2}\, V_2 V_1$,    (6-70a)

where the matrices $V_1$, $V_2$ are defined below. Defining

$V = V_2 V_1$,    (6-70b)

and denoting the largest eigenvalue of $V$ by $\Lambda$ one obtains, from (6-67), (6-69),
and from the above expression of $P$ in terms of $V$,

$\lim_{N\to\infty}\frac{1}{N}\ln Z = \frac{1}{2}\ln[2\sinh(2\beta J)] + \lim_{n\to\infty}\frac{1}{n}\ln\Lambda$.    (6-71)


We now define $V_1$, $V_2$ so as to proceed to the problem of diagonalization of
$V = V_2 V_1$. This is done by referring to the Pauli spin matrices

$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,  $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$,  $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.    (6-72)

Making use of these $2 \times 2$ matrices, we define $2^n \times 2^n$ matrices $X_\nu$, $Y_\nu$, $Z_\nu$
($\nu = 1, 2, \ldots, n$), each as the ordered direct product of $n$ number of $2 \times 2$
matrices, as follows: in $X_\nu$, all factors excepting the $\nu$th one are unit matrices,
while the $\nu$th factor is $X$; and similarly, in $Y_\nu$ (resp., $Z_\nu$), all factors
excepting the $\nu$th one are unit matrices, while the $\nu$th factor is $Y$ (resp., $Z$).


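With these definitions of $X_\nu$, $Y_\nu$, $Z_\nu$ ($\nu = 1, 2, \ldots, n$), $V_1$, $V_2$ are defined by the
expressions

$V_1 = \prod_{\nu=1}^{n} e^{\theta X_\nu}$  ($\tanh\theta = e^{-2\beta J}$),  $V_2 = \prod_{\nu=1}^{n} e^{\beta J Z_\nu Z_{\nu+1}}$  ($Z_{n+1} = Z_1$).    (6-73)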


The next step in the diagonalization of $V = V_2 V_1$ is to refer to a set of
$2^n \times 2^n$ matrices $\Gamma_\mu$ ($\mu = 1, 2, \ldots, 2n$) defined in terms of the following
anti-commutation relations

$\Gamma_\mu\Gamma_\nu + \Gamma_\nu\Gamma_\mu = 2\delta_{\mu\nu}$  ($\mu, \nu = 1, 2, \ldots, 2n$).    (6-74)

For any given value of $n$ ($= 1, 2, \ldots$), there exist various possible
representations or realizations of the $\Gamma_\mu$’s by means of $2^n \times 2^n$ matrices, all related
by similarity transformations. One such representation is specified as
follows:

$\Gamma_{2\nu-1} = X_1 X_2 \cdots X_{\nu-1} Z_\nu$  ($\nu = 1, 2, \ldots, n$),
$\Gamma_{2\nu} = X_1 X_2 \cdots X_{\nu-1} Y_\nu$  ($\nu = 1, 2, \ldots, n$),    (6-75)

where the matrices $X_\nu$, $Y_\nu$, $Z_\nu$ have been defined above in terms of direct
products involving the $2 \times 2$ Pauli matrices. One can now check that, in
this representation, the matrices $V_1$, $V_2$ are given by

$V_1 = \prod_{\nu=1}^{n} e^{-i\theta\Gamma_{2\nu}\Gamma_{2\nu-1}}$  ($\tanh\theta = e^{-2\beta J}$),
$V_2 = e^{i\beta J U\Gamma_1\Gamma_{2n}} \prod_{\nu=1}^{n-1} e^{-i\beta J\Gamma_{2\nu+1}\Gamma_{2\nu}}$  ($U = X_1 X_2 \cdots X_n$).    (6-76)
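The anti-commutation relations can be verified directly for small $n$ in the representation built from strings of $X$ matrices followed by $Z$ or $Y$. The sketch below uses plain nested lists and Kronecker products; the helper names (kron, site_operator, gammas, anticommutation_holds) are illustrative. It also checks the product $\Gamma_{2\nu}\Gamma_{2\nu-1} = iX_\nu$, which is what turns the exponent $-i\theta\Gamma_{2\nu}\Gamma_{2\nu-1}$ into $\theta X_\nu$.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker (direct) product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def identity(dim):
    return [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]

def site_operator(op, nu, n):
    """Direct product of n 2x2 factors: `op` at position nu (1-based), unit matrices elsewhere."""
    result = [[1]]
    for position in range(1, n + 1):
        result = kron(result, op if position == nu else I2)
    return result

def gammas(n):
    """Gamma_1, ..., Gamma_2n as X_1 ... X_{nu-1} Z_nu and X_1 ... X_{nu-1} Y_nu."""
    out = []
    string = identity(2 ** n)              # empty product for nu = 1
    for nu in range(1, n + 1):
        out.append(matmul(string, site_operator(Z, nu, n)))   # Gamma_{2 nu - 1}
        out.append(matmul(string, site_operator(Y, nu, n)))   # Gamma_{2 nu}
        string = matmul(string, site_operator(X, nu, n))
    return out

def max_abs_diff(A, B):
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def anticommutation_holds(n, tol=1e-12):
    """Check Gamma_mu Gamma_nu + Gamma_nu Gamma_mu = 2 delta_{mu nu} for all pairs."""
    G = gammas(n)
    dim = 2 ** n
    for mu in range(2 * n):
        for nu in range(2 * n):
            AB, BA = matmul(G[mu], G[nu]), matmul(G[nu], G[mu])
            anti = [[AB[i][j] + BA[i][j] for j in range(dim)] for i in range(dim)]
            target = [[2 if (mu == nu and i == j) else 0 for j in range(dim)]
                      for i in range(dim)]
            if max_abs_diff(anti, target) > tol:
                return False
    return True

print(anticommutation_holds(2), anticommutation_holds(3))
```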


Two new matrices $V^\pm$ are now defined as

$V^\pm = e^{\pm i\beta J\Gamma_1\Gamma_{2n}} \prod_{\nu=1}^{n-1} e^{-i\beta J\Gamma_{2\nu+1}\Gamma_{2\nu}} \prod_{\lambda=1}^{n} e^{-i\theta\Gamma_{2\lambda}\Gamma_{2\lambda-1}}$,    (6-77a)

in terms of which $V = V_2 V_1$ can be expressed in the form

$V = \frac{1}{2}(I + U)V^+ + \frac{1}{2}(I - U)V^-$,    (6-77b)

where $I$ stands for the $2^n \times 2^n$ unit matrix. The matrices $U$, $V^+$, $V^-$ commute
with each other. One can check that half the eigenvalues of $U$ are $+1$ while
the other half are $-1$. One can make a similarity transformation whereby
$U$ is transformed to the diagonal form

$\tilde{U} = \begin{pmatrix} I' & 0 \\ 0 & -I' \end{pmatrix}$,    (6-78)

where $I'$ stands for the $2^{n-1} \times 2^{n-1}$ unit matrix. $V^\pm$ are then transformed
into block diagonal forms $\tilde{V}^\pm$. The eigenvalues of these matrices
determine the largest eigenvalue of $V$. However, the eigenvalues of $\tilde{V}^\pm$ coincide
with those of $V^\pm$. These matrices will be referred to below.


Referring back to the matrices $\Gamma_\mu$ ($\mu = 1, 2, \ldots, 2n$), one can define a
similarity transformation in the space of $2^n$-dimensional matrices whereby $\Gamma_\mu$
goes to

$\Gamma'_\mu = \sum_{\nu=1}^{2n} \omega_{\mu\nu}\Gamma_\nu$,    (6-79)

where $\omega_{\mu\nu}$ are the elements of an orthogonal rotation matrix ($\omega$) in the
$2n$-dimensional space in which the $2^n$-dimensional spin matrices $\Gamma_\mu$
($\mu = 1, 2, \ldots, 2n$) are defined. The transformation of the matrices $\Gamma_\mu$ may be
looked upon as a rotation $S$ in the $2^n$ dimensional space, and there is a
one-to-one correspondence between the rotations $\omega$ and $S$. An elementary
rotation $\omega^{(\mu\nu)}(\theta)$ in the $\mu$-$\nu$ plane ($\mu, \nu = 1, 2, \ldots, 2n$) by an angle $\theta$ induces a
corresponding elementary rotation in the $2^n$-dimensional spin space given
by

$S^{(\mu\nu)}(\theta) = e^{-\frac{1}{2}\theta\Gamma_\mu\Gamma_\nu}$.    (6-80)


If one now considers the product ($\omega$) of elementary rotations in $n$ distinct
planes in the $2n$-dimensional space through angles $\theta_1, \theta_2, \ldots, \theta_n$, then $\omega$
induces a corresponding rotation $S(\omega)$ in the $2^n$-dimensional spin space.
For such a composite rotation one obtains the eigenvalues of $\omega$ as
$e^{\pm i\theta_1}, e^{\pm i\theta_2}, \ldots, e^{\pm i\theta_n}$, while those of $S(\omega)$ are seen to be
$e^{\frac{i}{2}(\pm\theta_1 \pm\theta_2 \pm\cdots\pm\theta_n)}$ where, in the last expression, one chooses the $\pm$ signs
independently of one another so as to obtain the $2^n$ eigenvalues.


It turns out that the matrices $V^\pm$ defined in (6-77a) are composite rotation
matrices of this type in spin space induced by $2n$-dimensional rotations
$\omega^\pm$ (these involve elementary rotations by imaginary angles), where the
eigenvalues of $\omega^\pm$ are $e^{\pm\gamma_k}$ ($k = 0, 1, 2, \ldots, 2n-1$), $\gamma_k$ being given by

$\cosh\gamma_k = \cosh(2\beta J)\cosh(2\theta) - \cos\frac{\pi k}{n}\,\sinh(2\beta J)\sinh(2\theta)$  ($\tanh\theta = e^{-2\beta J}$).    (6-81)

The numbers $\gamma_k$ satisfy

$\gamma_k = \gamma_{2n-k}$  ($k = 1, 2, \ldots, n-1$),  $0 < \gamma_0 < \gamma_1 < \cdots < \gamma_n$,    (6-82)

and the eigenvalues of $V^\pm$ can be expressed in terms of these as

eigenvalues of $V^-$: $e^{\frac{1}{2}(\pm\gamma_0 \pm\gamma_2 \pm\cdots\pm\gamma_{2n-2})}$,
eigenvalues of $V^+$: $e^{\frac{1}{2}(\pm\gamma_1 \pm\gamma_3 \pm\cdots\pm\gamma_{2n-1})}$,    (6-83)

where the $\pm$ signs are to be chosen independently of one another. These
constitute a set of $2 \times 2^n$ number of eigenvalues, and the eigenvalues of
$V$ belong to this set. However, one needs only the largest eigenvalue in
order to obtain the specific free energy in the thermodynamic limit (refer
to (6-71)), which can be obtained as follows.


Incidentally, (6-83) tells us that the assumptions that were stated earlier as 


being necessary for the validity of (6-69), are justified. 


One considers the representation of the spin matrices in which $U$ appears
in the form (6-78) and $V^\pm$ are transformed to $\tilde{V}^\pm$, and then diagonalizes
the matrices $\frac{1}{2}(I \pm \tilde{U})\tilde{V}^\pm$, thereby obtaining the diagonal matrices, say,
$V_D^\pm$. Calling $\Lambda^\pm$ the largest eigenvalues of $V_D^\pm$, the largest eigenvalue of
$V$ is obtained as $\Lambda = \max\{\Lambda^+, \Lambda^-\}$. It turns out that $\Lambda^-$, $\Lambda^+$ are the largest
eigenvalues belonging to the sets of eigenvalues of $V^-$ and $V^+$ respectively,
i.e.,

$\Lambda^- = e^{\frac{1}{2}(\gamma_0 + \gamma_2 + \cdots + \gamma_{2n-2})}$,  $\Lambda^+ = e^{\frac{1}{2}(\gamma_1 + \gamma_3 + \cdots + \gamma_{2n-1})}$    (6-84)

(an ambiguity arising in these expressions disappears in the
thermodynamic limit). It is then finally seen (by making use of (6-82)) that the
required largest eigenvalue of $V$ is

$\Lambda = e^{\frac{1}{2}(\gamma_1 + \gamma_3 + \cdots + \gamma_{2n-1})}$.    (6-85)
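The expression for the largest eigenvalue can be compared with a direct numerical computation for a small row length. The sketch below assumes $n = 3$, zero field, periodic columns, and $K = \beta J$; it builds the $2^n \times 2^n$ transfer matrix, finds its largest eigenvalue by power iteration, and compares it with $[2\sinh 2K]^{n/2}\,e^{\frac{1}{2}(\gamma_1 + \gamma_3 + \cdots + \gamma_{2n-1})}$. The relations $\cosh 2\theta = \coth 2K$ and $\sinh 2\theta = 1/\sinh 2K$, which follow from $\tanh\theta = e^{-2K}$, are used to evaluate $\gamma_k$. Function names are illustrative.

```python
import itertools
import math

def transfer_matrix(n, K):
    """Transfer matrix for rows of n spins, periodic columns, zero field; K = beta * J."""
    rows = list(itertools.product((+1, -1), repeat=n))
    def weight(mu, nu):
        within = sum(mu[k] * mu[(k + 1) % n] for k in range(n))
        between = sum(a * b for a, b in zip(mu, nu))
        return math.exp(K * (within + between))
    return [[weight(mu, nu) for nu in rows] for mu in rows]

def largest_eigenvalue(P, iterations=2000):
    """Power iteration; all entries of P are positive, so the iteration converges."""
    v = [1.0] * len(P)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in P]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def gamma(k, n, K):
    """Positive solution of cosh(gamma_k) = cosh(2K) coth(2K) - cos(pi k / n)."""
    c = math.cosh(2.0 * K) / math.tanh(2.0 * K) - math.cos(math.pi * k / n)
    return math.acosh(c)

def formula_lambda_max(n, K):
    """[2 sinh 2K]^(n/2) * exp((gamma_1 + gamma_3 + ... + gamma_{2n-1}) / 2)."""
    half_sum = 0.5 * sum(gamma(k, n, K) for k in range(1, 2 * n, 2))
    return (2.0 * math.sinh(2.0 * K)) ** (n / 2.0) * math.exp(half_sum)

for K in (0.3, 0.6):
    print(K, largest_eigenvalue(transfer_matrix(3, K)), formula_lambda_max(3, K))
```

For such a small $n$ this only tests the structure of the result; the thermodynamic limit requires the sum over $\gamma_k$ to be converted to an integral, as described next.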


Referring now to the formula (6-71), the second term on the right hand side requires the evaluation of the sum appearing in (6-85) which, in the thermodynamic limit, reduces to an integral, and one arrives at the result that the thermodynamic limit

$$\beta f = -\lim_{N\to\infty}\frac{1}{N}\ln Z_N \tag{6-86a}$$

for the specific free energy exists and is given by

$$\beta f = -\frac{1}{2}\ln(2\sinh(2\beta J)) - \frac{1}{8\pi^2}\int_0^{2\pi} d\omega_1 \int_0^{2\pi} d\omega_2\, \ln[2\cosh(2\beta J)\coth(2\beta J) - 2(\cos\omega_1 + \cos\omega_2)], \tag{6-86b}$$

an alternative form for the same being

$$f = -\beta^{-1}\Big[\ln(2\cosh(2\beta J)) + \frac{1}{2\pi}\int_0^{\pi} d\omega\, \ln\big\{\tfrac{1}{2}\big(1 + \sqrt{1 - \kappa^2\sin^2\omega}\big)\big\}\Big], \tag{6-86c}$$

where

$$\kappa = \frac{2\sinh(2\beta J)}{\cosh^2(2\beta J)}. \tag{6-86d}$$
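
The equivalence of the two forms (6-86b) and (6-86c) can be checked numerically. The following is a minimal sketch, assuming the prefactor $1/(8\pi^2)$ with both angles running over $[0, 2\pi]$ in (6-86b); the function names and the midpoint-rule grids are illustrative choices and are not part of the derivation above.

```python
import math

def kappa(K):
    """Modulus (6-86d) as a function of K = beta*J."""
    return 2.0 * math.sinh(2.0 * K) / math.cosh(2.0 * K) ** 2

def beta_f_double(K, n=200):
    """beta*f from (6-86b): midpoint rule on an n x n grid over [0, 2*pi]^2."""
    a = 2.0 * math.cosh(2.0 * K) / math.tanh(2.0 * K)
    cosines = [math.cos((k + 0.5) * 2.0 * math.pi / n) for k in range(n)]
    mean_log = sum(math.log(a - 2.0 * (c1 + c2))
                   for c1 in cosines for c2 in cosines) / n ** 2
    # (1/(8 pi^2)) times the integral equals half the grid average of the log
    return -0.5 * math.log(2.0 * math.sinh(2.0 * K)) - 0.5 * mean_log

def beta_f_single(K, n=20000):
    """beta*f from (6-86c): midpoint rule on [0, pi]."""
    k2 = kappa(K) ** 2
    total = 0.0
    for j in range(n):
        s = math.sin((j + 0.5) * math.pi / n)
        # max() guards against a tiny negative argument from rounding
        total += math.log(0.5 * (1.0 + math.sqrt(max(0.0, 1.0 - k2 * s * s))))
    # (1/(2 pi)) times the integral equals half the grid average of the log
    return -math.log(2.0 * math.cosh(2.0 * K)) - 0.5 * total / n

K = 0.3  # an arbitrary coupling away from the critical point
print(beta_f_double(K), beta_f_single(K))
```

The two printed numbers should agree to many digits. Close to the critical coupling the integrand of (6-86b) develops a logarithmic singularity, so a finer grid would be needed there.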


Recall that, in the context of magnetic lattice models, the limiting expression $\lim_{N\to\infty}\frac{1}{N}\ln Z$ for any specified value of the field strength $h$, not necessarily zero, is referred to as the pressure, though its physical significance is different from that of the pressure of a fluid.


Making use of the above expression for the specific free energy (i.e., $-\beta^{-1}$ times the pressure at $h = 0$), one can derive an expression for the internal energy per site (again, at $h = 0$) given by

$$u(h = 0) = \frac{\partial(\beta f)}{\partial\beta} = \lim_{N\to\infty}\frac{1}{N Z_N}\sum_{\mu_1,\ldots,\mu_n} E(\mu_1,\ldots,\mu_n)\exp[-\beta E(\mu_1,\ldots,\mu_n)] \tag{6-87a}$$

(refer to eq. (6-86a); for notation, see formula (6-65)), which works out to

$$u = -J\coth(2\beta J)\Big[1 + \frac{2}{\pi}\big(2\tanh^2(2\beta J) - 1\big)\int_0^{\pi/2}\frac{d\phi}{\sqrt{1 - \kappa^2\sin^2\phi}}\Big], \tag{6-87b}$$

where $\kappa$ is given by formula (6-86d), and where the integral on the right hand side is the complete elliptic integral of the first kind:

$$K_1(\kappa) = \int_0^{\pi/2}\frac{d\phi}{\sqrt{1 - \kappa^2\sin^2\phi}}. \tag{6-87c}$$
The complete elliptic integral $K_1(\kappa)$ has a logarithmic singularity at $\kappa = 1$ or, equivalently, at $\kappa' = \sqrt{1 - \kappa^2} = 0$. In the present context, using the value (6-86d) for $\kappa$, one obtains $\kappa'^2 = 1 - \kappa^2 = [2\tanh^2(2\beta J) - 1]^2$, and thus the singularity occurs at $\beta = \beta_c$ (corresponding to $T = T_c = (k_B\beta_c)^{-1}$) given by

$$\beta_c = \frac{1}{k_B T_c} = \frac{1}{2J}\ln(1 + \sqrt{2}), \tag{6-88}$$

which agrees with the value given by (6-60b).
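
The step from the vanishing of $\kappa'$ to the value (6-88) can be spelled out as follows; this is a short worked check that uses only (6-86d) and the identity $\cosh^2 x - \sinh^2 x = 1$, assuming $J > 0$.

```latex
\begin{align*}
\kappa' = 0 \;&\Longleftrightarrow\; 2\tanh^2(2\beta_c J) = 1
 \;\Longleftrightarrow\; 2\sinh^2(2\beta_c J) = \cosh^2(2\beta_c J) = 1 + \sinh^2(2\beta_c J) \\
&\Longleftrightarrow\; \sinh(2\beta_c J) = 1
 \;\Longleftrightarrow\; 2\beta_c J = \sinh^{-1}(1) = \ln(1+\sqrt{2}), \\
\kappa(\beta_c) &= \frac{2\sinh(2\beta_c J)}{\cosh^2(2\beta_c J)} = \frac{2\cdot 1}{2} = 1 .
\end{align*}
```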


In spite of the singularity of $K_1(\kappa)$, the free energy per site ($f$) is continuous at $\beta = \beta_c$, and so is the specific internal energy ($u$), the latter because of the presence of the factor $(2\tanh^2(2\beta J) - 1)$ in the expression (6-87b). However, the singularity shows up as a logarithmic divergence in the specific heat, which involves a derivative of $f$ of the next higher order (i.e., of order two). This is depicted schematically in fig. 6-12.

Figure 6-12: Depicting schematically the logarithmic singularity of the specific heat $c = \frac{\partial u}{\partial T}$ at $T_c$, where the complete elliptic integral of the first kind (eq. (6-87c)) possesses a singularity. (Axes: $c$ plotted against $T$, with $T_c$ marked.)


Singularities in the specific heat (of the nature of a discontinuity) are predicted in the Bragg-Williams and the Bethe-Peierls approximations to the 2-dim Ising model (see [67], chapter 14), the former being essentially the mean field theory of the Ising system. The estimates for the critical temperature obtained in these models differ from the exact value resulting from the Kramers-Wannier duality argument and the transfer matrix approach. Of the two approximations mentioned, the Bethe-Peierls approximation gives a better estimate of $T_c$, namely, $\beta_c J = \frac{1}{2}\ln 2$ for the 2-dim Ising model on a square lattice. The Bragg-Williams value is the same as that given by formula (6-29) which gives, for the 2-dim square lattice Ising model, $\beta_c J = \frac{1}{4}$.
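
To see how far the two approximations lie from the exact result, one can simply tabulate the three values of $\beta_c J$ quoted above for the square lattice. The sketch below does only that; the function name and dictionary keys are illustrative.

```python
import math

def critical_couplings():
    """Return beta_c * J for the square-lattice Ising model in three treatments."""
    return {
        "exact (6-88)": 0.5 * math.log(1.0 + math.sqrt(2.0)),
        "Bethe-Peierls": 0.5 * math.log(2.0),
        "Bragg-Williams": 0.25,
    }

for name, K in critical_couplings().items():
    # k_B T_c / J is the reciprocal of beta_c J
    print(f"{name:15s} beta_c J = {K:.4f}   k_B T_c / J = {1.0 / K:.4f}")
```

Both approximations underestimate $\beta_c J$, i.e., they overestimate $T_c$, the mean field (Bragg-Williams) value being the farther of the two from the exact one.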
Finally, there exists an exact result for the spontaneous magnetization (i.e., the mean value of the spin at any given lattice site) for the 2-dim Ising model, but this is of a fundamentally different nature from the results obtained in the present section for the special case $h = 0$. In contrast to the specific free energy, the specific internal energy, and the specific heat, all of which have uniquely defined values at any given temperature (discounting the singularity of the specific heat at $T_c$), the magnetization is non-unique for $T < T_c$. In order to calculate the spontaneous magnetization, one needs to calculate the free energy at a non-zero value of the field strength $h$ and then consider the limit $h \to 0$, when it is found that the right hand and the left hand limits ($h \to 0^\pm$) differ from each other when $T < T_c$. In the mean field theory, these values are obtained from (6-28). The exact result for the 2-dim Ising model in the thermodynamic limit was worked out by Yang (see [67], chapter 15) and is given by

$$m = \pm[1 - (\sinh(2\beta J))^{-4}]^{1/8} \quad (\beta > \beta_c), \tag{6-89}$$

where the two possible values at any given temperature $T < T_c$ are indicated. As we will see in the next section, the spontaneous magnetization for $h = 0$ can actually have a host of possible values (when defined as the expectation value of the spin at any specified lattice site) depending on the Gibbs state of the infinitely extended system. Starting from a finite system with any specified boundary condition and with $h = 0$, and then going over to the thermodynamic limit, the magnetization is found to depend on the boundary condition chosen. The values in (6-89) are obtained for the ‘+’ and ‘-’ boundary conditions ($\eta_+, \eta_-$ mentioned in sec. 6.2.4.1) respectively. In contrast, the Dobrushin boundary condition ($\eta_D$) gives a value $m = 0$, while other boundary conditions give values of $m$ intermediate between the two given above. However, if the magnetization is worked out with $h \neq 0$ and then $h$ is made to go to zero, one ends up with the two values given by (6-89).
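
The formula (6-89) is easy to evaluate. The following sketch returns the positive branch as a function of $K = \beta J$ and sets the value to zero for $\beta \leq \beta_c$, where (6-89) does not apply; the function name and the constant `BETA_C_J` are illustrative.

```python
import math

BETA_C_J = 0.5 * math.log(1.0 + math.sqrt(2.0))  # critical coupling, eq. (6-88)

def spontaneous_magnetization(K):
    """Positive branch of (6-89) for K = beta*J; zero for K <= beta_c*J."""
    if K <= BETA_C_J:
        return 0.0
    return (1.0 - math.sinh(2.0 * K) ** -4) ** 0.125

for K in (0.40, 0.45, 0.60, 1.00):
    print(f"beta J = {K:.2f}   m = {spontaneous_magnetization(K):.6f}")
```

The exponent $1/8$ makes the magnetization rise very steeply just above $\beta_c$, so that $m$ is already close to unity at moderately low temperatures.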


6.2.5 Gibbs states for the infinitely extended Ising lattice


The present section is based on chapters 3 and 6 of [41]. While most of the statements made below will be informal in nature, more precise statements and proofs will be found in Friedli and Velenik’s book, one of great value.

The general idea of Gibbs states for infinitely extended systems has been briefly explained earlier in chapter 5, sec. 5.6, with reference to a number of points of view. Among these, the DLR approach will be taken up in the present context, in sec. 6.2.5.5. Before that, we consider below the Ising system in the thermodynamic limit.

6.2.5.1 Finite volume Gibbs distributions: recapitulation


Before going over to Gibbs distributions defined for an infinitely extended lattice, it is worthwhile to summarize in an informal manner a number of relevant aspects relating to probability distributions for a finite lattice and to states defined from such distributions in the thermodynamic limit. These considerations are relevant in that they help us define the characteristic features of phase transitions and also provide us with pointers to the theory of Gibbs states or, what amounts to the same thing, Gibbs distributions, for an infinite system. The present section will mostly be in the nature of recapitulations (with a bit of new notation thrown in) of ideas and results we have already met with, and will be stated in the general setting of a $D$-dimensional lattice ($D \geq 2$). The theory of infinite volume Gibbs distributions has to deal with a number of mathematical problems before a satisfactory formulation is arrived at. In this book, however, we will not engage with these mathematical issues and will try to gain a preliminary understanding of infinite volume Gibbs distributions in less technical terms (refer back to sec. 5.6).


A finite volume Gibbs distribution is defined over a finite set of lattice points $\Lambda$ belonging to an infinitely extended $D$-dim lattice, the structure of which will not be made explicit (it may prove to be useful, however, to think in terms of a 2-dim square lattice or a 3-dim simple cubic lattice so as to make a mental picture of things). The number of lattice points in $\Lambda$ will be denoted by $N_\Lambda$. The thermodynamic limit will correspond to $N_\Lambda \to \infty$ in the sense of Van Hove, which implies that the ratio of the number of boundary points to $N_\Lambda$ goes to zero (a boundary point is one for which there exists at least one nearest neighbor not belonging to $\Lambda$).
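
As an illustration of the Van Hove requirement, one may count the boundary points of an $L \times L$ square box in the 2-dim square lattice, using the definition of a boundary point given above. The sketch below is a direct enumeration; the choice of square boxes and the function name are assumptions made for the sake of the example.

```python
def boundary_fraction(L):
    """Fraction of sites of an L x L square box having a nearest neighbour outside it."""
    sites = {(x, y) for x in range(L) for y in range(L)}
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    boundary = [s for s in sites
                if any((s[0] + dx, s[1] + dy) not in sites for dx, dy in neighbours)]
    return len(boundary) / len(sites)

for L in (4, 16, 64):
    print(L, boundary_fraction(L))
```

For $L \geq 2$ the count of boundary points is $4L - 4$, so the fraction falls off as $1/L$ and goes to zero as the box grows.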


A spin configuration in $\Lambda$ (denoted by, say, $\omega_\Lambda$) is a possible assignment of spin values ($+1$ or $-1$) at all the $N_\Lambda$ sites, there being $2^{N_\Lambda}$ such configurations, the latter constituting the set $\Omega_\Lambda$. In contrast, a configuration in the infinite lattice (to which $\Lambda$ belongs) is an infinite sequence specifying the spin value at each and every lattice site.


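A Gibbs distribution on the finite lattice $\Lambda$ is a probability distribution over the set $\Omega_\Lambda$ of spin configurations which, subject to some boundary condition $\eta$ (see below; refer back to sec. 6.2.4.1), assigns a probability $P_\Lambda^{(\eta)}(\omega_\Lambda)$ to every configuration $\omega_\Lambda$ as

$$P_\Lambda^{(\eta)}(\omega_\Lambda) = \frac{1}{Z_\Lambda^{(\eta)}}\exp[-\beta H_\Lambda^{(\eta)}(\omega_\Lambda)], \tag{6-90a}$$

where $H_\Lambda^{(\eta)}(\omega_\Lambda)$ denotes the energy of the configuration $\omega_\Lambda$ under the boundary condition $\eta$, and the partition function $Z_\Lambda^{(\eta)}$ is given by

$$Z_\Lambda^{(\eta)} = \sum_{\omega_\Lambda\in\Omega_\Lambda}\exp[-\beta H_\Lambda^{(\eta)}(\omega_\Lambda)], \tag{6-90b}$$

the summation in the last formula being over the set of all possible configurations in $\Omega_\Lambda$. As usual, $\beta$ is related to the temperature as $\beta = (k_B T)^{-1}$. The field strength $h$ is left implied in formulae (6-90a), (6-90b).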
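
For a very small box the distribution (6-90a), (6-90b) can be written out by brute force. The sketch below does this for an $L \times L$ box in the square lattice, assuming the energy $H = -J\sum_{\langle ij\rangle}\sigma_i\sigma_j - h\sum_i\sigma_i$ with every exterior spin frozen at a common value `eta`; this form of the energy, the function names, and the use of `eta = 0` to switch off the boundary coupling are assumptions of the example and are not taken from sec. 6.2.4.1.

```python
import itertools
import math

def gibbs_distribution(L, beta, J=1.0, h=0.0, eta=+1):
    """Finite-volume Gibbs distribution on an L x L box; spins outside are frozen at eta."""
    sites = [(x, y) for x in range(L) for y in range(L)]
    index = {s: k for k, s in enumerate(sites)}

    def energy(config):
        E = 0.0
        for (x, y), k in index.items():
            E -= h * config[k]
            for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
                nb = (x + dx, y + dy)
                if nb in index:
                    if dx + dy > 0:   # count each interior bond once
                        E -= J * config[k] * config[index[nb]]
                else:                 # bond to a frozen exterior spin
                    E -= J * config[k] * eta
        return E

    configs = list(itertools.product((+1, -1), repeat=len(sites)))
    weights = [math.exp(-beta * energy(c)) for c in configs]
    Z = sum(weights)                  # partition function, eq. (6-90b)
    return configs, [w / Z for w in weights], index

def expectation(configs, probs, f):
    """Mean value of the function f of a configuration under the distribution."""
    return sum(p * f(c) for c, p in zip(configs, probs))

configs, probs, index = gibbs_distribution(3, beta=0.5, eta=+1)
centre = index[(1, 1)]
print(sum(probs), expectation(configs, probs, lambda c: c[centre]))
```

Even for this tiny box the ‘+’ exterior biases the central spin towards $+1$; repeating the run with `eta=-1` reverses the sign of the mean value.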


Observe that, in (6-90a), one need not consider the interaction energy among the spins in $\Lambda^c$, the complementary set of $\Lambda$ in the infinite lattice, since this energy gets canceled in the ratio on the right hand side. In other words, one need consider only the interaction energy among the spins within $\Lambda$, and that between spins in $\Lambda$ and $\Lambda^c$, i.e., the boundary energy.


In the above expressions, $H_\Lambda^{(\eta)}(\omega_\Lambda)$ represents the energy of the spin configuration $\omega_\Lambda$ in $\Lambda$ subject to the boundary condition $\eta$. We will occasionally desist from referring explicitly to the boundary condition $\eta$ or to the finite set $\Lambda$ in the mathematical formulae for the sake of simplicity, leaving these to be implied by the context.


A few of the more commonly invoked boundary conditions have been mentioned in sec. 6.2.4.1 (generalizations to dimension $D > 2$ are straightforward). In addition to (or, more precisely, in generalization of) these, one can define a configurational boundary condition, specified by a configuration, say, $\eta_c$ of the infinitely extended lattice (to which $\Lambda$ belongs), so that, in specifying a configuration $\omega_\Lambda$, it is to be assumed that the configuration of spins in the complementary set $\Lambda^c$ (made up of sites not belonging to $\Lambda$) remains fixed at that implied by $\eta_c$. For a given configuration $\eta_c$, this definition has the advantage that one can choose the finite set $\Lambda$ arbitrarily, such that a configuration $\omega_\Lambda$ can be defined subject to the boundary condition that the spins in $\Lambda^c$ are held frozen in the configuration implied by $\eta_c$ (this defines the restriction of $\eta_c$ to $\Lambda^c$, denoted by $\eta_c|_{\Lambda^c}$).


It is straightforward to see how the boundary conditions mentioned in sec. 6.2.4.1 can all be looked upon as special instances of such configurational boundary conditions. Thus, for instance, the open (or free) boundary condition $\eta_F$ corresponds to an $\eta_c$ that can be chosen arbitrarily, while the boundary condition $\eta_+$ (resp., $\eta_-$) corresponds to an $\eta_c$ where all spins are in the ‘+’ (resp., ‘-’) state so that, for any given $\Lambda$, all spins in $\Lambda^c$ are held frozen in this state, and only the states of spins within $\Lambda$ are allowed to vary from one configuration ($\omega_\Lambda$) to another.


The expectation value of any given function $f$ of the spin variables (for instance, the spin $\sigma_0$ at any specified site chosen as the origin, or the correlation $\sigma_i\sigma_j$ between spins at sites marked $i$ and $j$) under a probability distribution of the type (6-90a) is denoted by $\langle f\rangle$. The dependence of $\langle f\rangle$ on $\Lambda$, $\eta$, $h$, and $T$ (or, equivalently, $\beta$) is often left implied, one or more of these being mentioned as demanded by the context.
6.2.5.2 The thermodynamic limit: pressure and magnetization 


The pressure. 


Given a set of lattice points $\Lambda$ and a boundary condition $\eta$, the pressure as a function of $\beta, h$ is defined as

$$p_\Lambda^{(\eta)}(\beta, h) = \frac{1}{N_\Lambda}\ln Z_\Lambda^{(\eta)}. \tag{6-91}$$

In the case of the free boundary condition ($\eta_F$), or the periodic boundary condition ($\eta_P$), the pressure turns out to be an even function of $h$ while, in the case of the ‘+’ and ‘-’ boundary conditions ($\eta_+, \eta_-$), one has

$$p_\Lambda^{+}(\beta, -h) = p_\Lambda^{-}(\beta, h). \tag{6-92}$$

Moreover, the pressure is a convex function of $(\beta, h)$.

The existence of the thermodynamic limit means that the pressure $p$ is a well defined function of $\beta, h$ in the limit $N_\Lambda \to \infty$ (as $\Lambda$ is made to grow in the sense of Van Hove so as to cover the entire infinite lattice), being independent of the boundary condition. Moreover, this function $p(\beta, h)$ is convex and is even in $h$ (refer to [41] for proof).
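
The symmetry (6-92), and the evenness in $h$ under the free boundary condition, can be verified by direct enumeration on a small box. The sketch below assumes the energy $H = -J\sum_{\langle ij\rangle}\sigma_i\sigma_j - h\sum_i\sigma_i$ with exterior spins frozen at `eta` (`eta = 0` standing for the free boundary condition); this form of the energy and the function name are assumptions of the example.

```python
import itertools
import math

def pressure(L, beta, h, eta, J=1.0):
    """p = (1/N) ln Z for an L x L box; eta = +1, -1 (frozen exterior) or 0 (free)."""
    sites = [(x, y) for x in range(L) for y in range(L)]
    index = {s: k for k, s in enumerate(sites)}
    Z = 0.0
    for config in itertools.product((+1, -1), repeat=len(sites)):
        E = 0.0
        for (x, y), k in index.items():
            E -= h * config[k]
            for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
                nb = (x + dx, y + dy)
                if nb in index:
                    if dx + dy > 0:   # count each interior bond once
                        E -= J * config[k] * config[index[nb]]
                else:                 # bond to a frozen exterior spin
                    E -= J * config[k] * eta
        Z += math.exp(-beta * E)
    return math.log(Z) / len(sites)

beta, h = 0.4, 0.3
print(pressure(3, beta, -h, +1), pressure(3, beta, h, -1))  # equal, eq. (6-92)
print(pressure(3, beta, h, 0), pressure(3, beta, -h, 0))    # equal, free b.c.
print(pressure(3, beta, h, +1), pressure(3, beta, -h, +1))  # differ at finite size
```

The last line shows that, for a finite box, the pressure under the ‘+’ boundary condition is not even in $h$; the difference is a boundary effect that disappears in the thermodynamic limit.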


The magnetization. 


The magnetization is represented by the random variable

$$m = \frac{1}{N_\Lambda}\sum_{i=1}^{N_\Lambda}\sigma_i, \tag{6-93}$$

where the spins at lattice sites are assigned labels $1, 2, \ldots, N_\Lambda$ in some appropriate manner. The mean value of this variable under a probability distribution of the type (6-90a) will be denoted by $\langle m\rangle$, with its dependence on $\Lambda, \eta$ and on $\beta, h$ implied. Making explicit the dependence on $\Lambda$ with the help of a suffix (to remind us that the thermodynamic limit is yet to be gone over to), one can check that

$$\langle m\rangle_\Lambda = \frac{\partial p_\Lambda}{\partial h}. \tag{6-94}$$
The magnetization, and hence the pressure, is a key factor in indicating the occurrence of a phase transformation when the thermodynamic limit is taken. However, the question arises as to whether the formula (6-94) holds in the thermodynamic limit. This depends on whether, in the thermodynamic limit, the pressure $p(\beta, h) = \lim_{N_\Lambda\to\infty} p_\Lambda(\beta, h)$ is differentiable with respect to $h$, i.e., whether the left hand and right hand derivatives exist and are equal. It turns out that, for a given $\beta$, the left hand and right hand derivatives of the pressure in the thermodynamic limit are equal, save for a countable number of values of $h$.


For the non-exceptional values, the thermodynamic limit of $\langle m\rangle_\Lambda$ exists, is independent of the boundary condition, is given by (6-94) (with the suffix $\Lambda$ removed), and is a continuous, non-decreasing function of $h$. For the exceptional values, on the other hand, the thermodynamic limit ($\langle m\rangle_\Lambda \to \langle m\rangle$) is discontinuous in $h$, though its right hand and left hand limits agree with the right hand and left hand derivatives of the pressure. The right hand limit (chosen for the sake of concreteness; the left hand limit would do equally well) $\lim_{h\to 0^+}\langle m\rangle$ exists for all $\beta$ and is referred to as the spontaneous magnetization ($m^*(\beta)$). As the formula (6-89) tells us, for the 2-dim Ising model, a discontinuity does arise at $h = 0$ for $\beta > \beta_c$.


Phase transition: characteristic features. 


This leads us to one of the characteristic features of a phase transition: the system under consideration exhibits a first order phase transition if the pressure fails to be differentiable at some value of $\beta, h$. This is complemented by the statement that a phase transition corresponds to a point of non-analyticity of the specific free energy at some temperature $T = T_c$, the two functions being related in the thermodynamic limit as $p = -\beta f$. As we have seen, these two characterizations of a phase transition imply a third, equivalent characterization whereby the magnetization becomes discontinuous and multi-valued as $h \to 0$ for $\beta > \beta_c$, the value of $\beta_c$ having been located exactly for the 2-dim case in earlier sections. These features relating to the magnetization are borne out in the mean field theory outlined in sec. 6.2.3.


6.2.5.3 Expectation values of local observables: infinite volume Gibbs states


The pressure and the specific free energy relate to the thermodynamic properties of a system but do not tell us anything more about the detailed statistical features of its state. These thermodynamic quantities are determined by the state, but are only sums over the phase space (i.e., the set of all possible configurations in the present context) of the probability distribution (refer to (6-90a) and imagine the set $\Lambda$ to be infinitely large), and it is the probability distribution that ultimately characterizes the state. It thus becomes imperative to understand the probability distribution for infinitely large volumes, referred to as the Gibbs distribution. The two terms, ‘Gibbs distribution’ and ‘Gibbs state’, describe the same thing from two complementary points of view, which we now discuss. We will look at these in the specific context of an infinitely extended system (the Ising system, to be specific), though these remain relevant for finite systems as well. The present discussion will complement the ideas introduced in section 5.6 in the context of a simple fluid.


The probability distribution gives the most complete possible description of the state, which determines such details as the fluctuations and correlations among the spins of the infinitely extended lattice. In turn, the probability distribution itself is determined completely by all possible expectation values of local observables like the correlations between arbitrarily fixed spins in the lattice. In other words, the state is equally well described by the totality of all expectation values of local observables, this being the second of the two viewpoints referred to above. In particular, the state is completely determined by the set of correlations $\{\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle\}$ for all possible $n$ ($= 1, 2, \ldots$) and, for each $n$, for all possible choices of the sites marked $i_1, i_2, \ldots, i_n$.


Let us consider the set of all possible local observables $A$, each of which depends on a finite number of spins. A state is then defined as a linear mapping from each $A$ to a real number $\langle A\rangle$, to be interpreted as the expectation value of $A$. For instance, if $\Lambda$ denotes a finite set of lattice points marked with the index $i$ ($= 1, 2, \ldots, N_\Lambda$) then

$$m_\Lambda = \frac{1}{N_\Lambda}\sum_{i=1}^{N_\Lambda}\sigma_i \tag{6-95}$$

is a local observable. In order that the mapping $A \to \langle A\rangle$ may describe a state, it has to be, in addition to being linear, a positive mapping and has to satisfy $\langle I\rangle = 1$ (normalization), where $I$ stands for the unit observable (one whose value is unity for all possible configurations of spins in the infinitely extended lattice).


One can now state the relation between finite- and infinite-volume states in terms of expectation values: consider a local observable $A$, for which the expectation value $\langle A\rangle_\Lambda$ under the finite volume distribution (6-90a) (specified for some boundary condition $\eta$, and for given values of $\beta, h$) tends to a well defined limit $\langle A\rangle$ as $\Lambda$ is made to expand so as to cover the entire infinite lattice in the sense of Van Hove; if this requirement is satisfied for all possible local observables $A$, then the mapping $A \to \langle A\rangle$ defines an infinite-volume Gibbs state in the sense defined above.


While making the above statement, we have assumed that $\eta$ specifies a certain configurational boundary condition (see sec. 6.2.5.1). More generally, we can choose a sequence of boundary conditions as the set $\Lambda$ is made to expand in successive stages. The question as to how the mapping $A \to \langle A\rangle$ arrived at through the above limiting procedure depends on the sequence of boundary conditions chosen is an interesting one and does not have an exhaustive answer. However, this possible dependence on the sequence of boundary conditions chosen provides the basis for the possible multiplicity of Gibbs distributions defined for an infinitely extended lattice without reference to limiting transitions through finite subsets $\Lambda$.


In the above, we started from probability distributions over spin configurations of finite sets ($\Lambda$) of lattice points and then arrived at the concept of (infinite volume) Gibbs states through the limits of expectation values of local observables. The reverse procedure is also meaningful and worthwhile, though it is based on a number of mathematical pre-requisites relating to the notion of convergence in the space of probability measures, which we will avoid. In brief, one can define a probability distribution over the set of spin configurations of an infinitely extended lattice (we will refer to this set as $\Omega$) and then associate, with each local observable $A$, an expectation value $\langle A\rangle$ (a normalized positive linear functional) as implied by that probability distribution. Probability distributions are also referred to as probability measures. Denoting an admissible probability measure by $\mu$ (recall that a probability distribution pertaining to a finite set $\Lambda$ was denoted earlier by $P_\Lambda$), the expectation value of a local observable $A$ under the measure $\mu$ can be written as $\langle A\rangle = \mu(A)$.


Corresponding to the fact that there may exist, under certain circumstances, more than one infinite volume state, one encounters the possibility of more than one probability distribution (defined in appropriate mathematical terms and referred to as a ‘Gibbs measure’) pertaining to the infinitely extended lattice.


In accordance with what is outlined in the present section, the terms Gibbs state, Gibbs distribution, and Gibbs measure will be used interchangeably. Usually the context will indicate which of the two complementary points of view mentioned above is at work.


Since a Gibbs measure (say, $\mu$) is defined over the set of all possible spin configurations in the infinitely extended lattice, one can associate a probability with each and every event pertaining to the set $\Omega$. For instance, the probability of a preassigned spin $\sigma_0$ (located at an arbitrarily chosen origin in the lattice) having the value $-1$ (an event) may be denoted by $\mu(\sigma_0 = -1)$. Probabilities of events can be expressed in terms of expectation values of local observables.


Recall now the way that expectation values of local observables pertaining to a Gibbs state were arrived at as limits ($\Lambda \to \infty$) of expectation values under probability distributions over $\Omega_\Lambda$ (for finite $\Lambda$), defined for some specified succession of boundary conditions (or for some specified configurational boundary condition). We consider, in particular, two special states, corresponding to mappings denoted by $A \to \langle A\rangle^+$ and $A \to \langle A\rangle^-$, arrived at under limiting procedures subject to the ‘+’ and ‘-’ boundary conditions ($\eta_+, \eta_-$) respectively. It turns out that these two states are distinct under certain conditions (i.e., values of the parameters $(\beta, h)$) but are identical under others.


For instance, referring to the result (6-89), one obtains distinct states for $\beta > \beta_c, h = 0$, while other combinations of values of $\beta, h$ correspond to a unique state each.


We will denote the Gibbs measures corresponding to these two states by $\mu^+, \mu^-$ respectively. Each of these implies well defined values for probabilities of events that can be related to expectation values of local observables.

For instance, considering the event of the spin $\sigma_0$ having a value $-1$, one has the relation (expected on physical grounds)

$$\mu^+[\sigma_0 = -1] = \frac{1}{2}(1 - \langle\sigma_0\rangle^+). \tag{6-96}$$
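
The relation (6-96) follows in one line from the linearity and normalization of the state, once the event is expressed through its indicator function. A short worked version, with $\mathbf{1}_{\{\sigma_0=-1\}}$ (a symbol introduced here for the purpose) denoting the local observable that equals 1 when $\sigma_0 = -1$ and 0 otherwise, and $I$ the unit observable:

```latex
\begin{align*}
\mathbf{1}_{\{\sigma_0=-1\}} &= \tfrac{1}{2}(1-\sigma_0) \qquad (\sigma_0 = \pm 1), \\
\mu^{+}[\sigma_0=-1] &= \big\langle \mathbf{1}_{\{\sigma_0=-1\}} \big\rangle^{+}
 = \tfrac{1}{2}\big(\langle I\rangle^{+} - \langle\sigma_0\rangle^{+}\big)
 = \tfrac{1}{2}\big(1-\langle\sigma_0\rangle^{+}\big).
\end{align*}
```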


As mentioned in earlier paragraphs, Gibbs measures such as $\mu^+$ and $\mu^-$ can be defined directly on $\Omega$, the set of configurations in the infinitely extended lattice, without reference to the limiting procedure alluded to above. One encounters in this way a rich structure of the set of Gibbs states, which we briefly elucidate below.

Translationally invariant Gibbs states


In particular, the notion of translation invariance of a Gibbs state (or, equivalently, of a Gibbs measure) is of fundamental relevance (refer back to sec. 5.6.2). Given any set of lattice points $\Lambda$, one can define a set $\Lambda_\tau$ obtained by shifting every site in $\Lambda$ by some specified lattice translation $\tau$. Correspondingly, given a local observable $A$, one obtains a translated observable $A_\tau$ such that the value assumed by $A$ on any specified set of spin variables is the same as the value assumed by $A_\tau$ on the set of spins resulting from the translation $\tau$. A Gibbs state (corresponding to the mapping $A \to \langle A\rangle$) is then said to be a translationally invariant one if, for every such translation $\tau$, $\langle A\rangle = \langle A_\tau\rangle$.


It turns out that the measures $\mu^\pm$ (which may correspond to distinct states under certain conditions) introduced above are translationally invariant. Analogous to the measures $\mu^\pm$, one can also construct a Gibbs measure $\mu^F$ by a limiting procedure on finite volume distributions, where the transition to the thermodynamic limit is effected under the free boundary condition. Once again, $\mu^F$ describes a translationally invariant Gibbs state.


The reference to the boundary conditions (‘+’, ‘-’, or ‘F’) in the Gibbs states seems to indicate that the definition of these states depends on a construction involving a limiting procedure under specified successions of boundary conditions. However, the same states can also be constructed in terms of infinite volume measures without direct reference to limits of finite volume probability distributions. The correspondence between the two ways of constructing Gibbs states becomes clear when one refers to the DLR approach to Gibbs states, to be briefly outlined in sec. 6.2.5.5.


6.2.5.4 Phase transitions and the phase diagram: results in summary


In the present chapter we have introduced the nearest neighbour Ising model in one, two and higher dimensions. We have seen by an application of the transfer matrix method that the 1-dim Ising model does not admit of a phase transition, in conformity with general results on 1-dim models with short range interactions, but the two dimensional model does show a phase transition at a value $\beta_c$ of the parameter $\beta = (k_B T)^{-1}$ given by (6-60b), where one finds by the transfer matrix method that, in the absence of a magnetic field ($h = 0$), the specific free energy $f$ becomes non-analytic (a related result is that the left hand and right hand derivatives of the pressure with respect to the field strength $h$ become unequal for $h \to 0$), as indicated by a logarithmic singularity in the specific heat (see fig. 6-12). The low-temperature representation following from Peierls’ argument (sec. 6.2.4.3) implies the existence of a phase transition at some sufficiently low temperature for any dimension $D \geq 2$. It turns out that $\beta_c$ is non-increasing in the dimension $D$.


The phase transition appears only as one goes over to the thermodynamic limit, when the thermodynamic variables such as the specific free energy and the pressure depend on $\beta, h$ in a manner independent of the boundary conditions that affect the Gibbs distribution for a finite system (i.e., spins at a finite set $\Lambda$ of lattice points). In other words, the existence of a phase transition and values of the thermodynamic variables such as $f, p$ do not depend on the succession of boundary conditions as $\Lambda$ is made to approach the thermodynamic limit (invading the entire lattice in the sense of Van Hove). However, the states that emerge in the thermodynamic limit for $\beta > \beta_c$ are sensitive to the boundary conditions through which the thermodynamic limit is achieved.


States for an infinitely extended system can be defined in two complementary and equivalent ways, as outlined in sec. 6.2.5.3. Of these, the approach in terms of expectation values of local observables makes use of finite volume probability distributions under specified boundary conditions and then looks at the thermodynamic limit. In principle, the boundary condition may have a determining role in the infinite volume distribution arrived at, and various different boundary conditions may lead to multiple Gibbs states for the system. This indeed is what happens for $\beta > \beta_c, h = 0$, for which we have already encountered two distinct states corresponding to the ‘+’ and the ‘-’ boundary conditions ($\eta_\pm$), while a third state (in addition to other possible ones) under the free boundary condition ($\eta_F$) is also possible.


Indeed, the Gibbs states can possibly have a rich structure that can be more fruitfully revealed when one adopts the other equivalent approach to arrive at infinite volume Gibbs states, where the latter are defined without overt reference to the thermodynamic limit approached through a succession of finite volumes. There exists a correspondence between states defined in terms of expectation values of local observables (whereby mappings of a definite kind, $A \to \langle A\rangle$, are established) and infinite volume Gibbs measures ($\mu$). In particular, Gibbs measures $\mu^\pm$ and $\mu^F$ exist, corresponding to the states alluded to above, of which the measures $\mu^\pm$ will be seen below (sec. 6.2.5.5) to have a special significance in virtue of being extremal states, i.e., ones that correspond to pure phases.


Gibbs states (and, correspondingly, Gibbs measures) can be grouped into those that are translationally invariant, and ones that are not. In particular, the states described by $\mu^\pm$ and $\mu^F$ are translationally invariant. Generally speaking, translationally invariant states correspond to pure phases and statistical mixtures of pure phases. Coexisting phases, on the other hand, correspond to translationally non-invariant states where there appear surfaces of separation between different phases.


Fig. 6-13 gives the phase diagram of the Ising model in dimension $D \geq 2$. There occurs a continuous phase transition at $T = T_c, h = 0$ (the critical point of the phase transition); as one crosses this point along the temperature axis from $T > T_c$ to $T < T_c$, the unique Gibbs state with magnetization $\langle m\rangle = 0$ splits into two (at $T = T_c, h = 0$, the magnetization is zero), one of which is approached as $h$ is made to tend to zero from above while the other gets selected as $h$ is made to approach zero from below (refer to fig. 6-2, which depicts these features in the context of the mean field theory). If, on the other hand, $h$ is held constant at the value 0, one may have a host of different equilibrium states in addition to the two mentioned above. These are encountered as one tries to construct Gibbs states for the infinitely extended system through the DLR approach outlined in sec. 6.2.5.5.

Figure 6-13: The phase diagram of the Ising model in dimension $D \geq 2$ (regions of the $T$-$h$ plane are labeled ‘unique’ or ‘non-unique’); infinite volume states (the Gibbs states) depend on the parameters $\beta, h$; the figure indicates how states of qualitatively different types appear for various combinations of $T$ ($= (k_B\beta)^{-1}$) and $h$; the line extending on the temperature axis from $T = 0$ to the transition temperature $T_c$ corresponds to multiple, or non-unique, Gibbs states, while all the other points in the $T$-$h$ diagram correspond to unique states each; the temperature $T_c$ at $h = 0$ corresponds to a continuous phase transition (phase transition of the second kind), while the transition across the line $h = 0, T < T_c$ is of the first kind and is a discontinuous one; as $T$ crosses the value $T_c$ from above, with $h = 0$, two distinct states appear (with magnetization given by (6-89) in the 2-dim case); for any $T < T_c$ and $h = 0$, the two states mentioned above are extremal ones, while a host of other states are also possible.
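
The content of the phase diagram can be summed up in a single predicate. The sketch below does so for the 2-dim square lattice, in units with $J = k_B = 1$ and with $T_c$ taken from (6-88); the function name and the choice of units are assumptions of the example.

```python
import math

T_C = 2.0 / math.log(1.0 + math.sqrt(2.0))  # k_B T_c / J for the 2-dim square lattice

def gibbs_state_is_unique(T, h):
    """True where fig. 6-13 indicates a unique Gibbs state (units: J = k_B = 1)."""
    return not (h == 0.0 and 0.0 <= T < T_C)

for T, h in ((1.5, 0.0), (1.5, 0.1), (T_C, 0.0), (3.0, 0.0)):
    print(f"T = {T:.3f}, h = {h:+.1f}: unique = {gibbs_state_is_unique(T, h)}")
```

Only the segment $h = 0$, $T < T_c$ returns `False`; the critical point itself is counted as unique, in keeping with the statement that the magnetization is zero there.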


The spontaneous magnetization, i.e., the value of $\langle m\rangle$ as $h \to 0^\pm$, distinguishes an ordered state ($\langle m\rangle = \pm m^*$ for $\beta > \beta_c$) from a disordered one ($\langle m\rangle = 0$ for $\beta < \beta_c$), and is referred to as the order parameter. The magnetization $\langle m\rangle(\beta, h)$ is uniquely defined for all $h \neq 0$ and all $\beta$, and is an odd and non-decreasing function of $h$, being discontinuous across $h = 0$ for $\beta > \beta_c$.


I will now briefly outline the DLR approach to the infinite volume Gibbs
states, and mention a few basic facts relating to the structure of the set
of these states.

6.2.5.5 Gibbs states: the DLR theory


The Dobrushin-Lanford-Ruelle theory (‘DLR’ theory in brief; refer back
to sec. 5.6.4 where the DLR approach to Gibbs states is mentioned) of
Gibbs states is based on a consistency condition that such a state has
to satisfy. This essentially corresponds to the requirement that, since a
Gibbs state describes a condition of equilibrium at some given value of
the parameter $\beta$, every finite set of lattice points has to be in
equilibrium with every other set that encloses it, for that same value of
$\beta$. We sketch a brief and informal explanation of the DLR condition in
the rest of the present section.


Fig. 6-14 depicts schematically a finite set of lattice points $\Lambda$
chosen arbitrarily within the infinitely extended lattice, and a proper
subset $\Delta$ within $\Lambda$. We consider an arbitrarily chosen
configurational boundary condition $\eta$, subject to which the finite volume
Gibbs distribution over spin configurations in $\Lambda$ is denoted by
$P_\Lambda^\eta$, the probability of a spin configuration $\omega_\Lambda$
under this distribution being given by (6-90a), (6-90b). Considering, then,
an appropriately defined function $A$ of spin configurations in the infinite
lattice, the expectation value of $A$ subject to the boundary condition
$\eta$ on the set $\Lambda$ is given by

$$\langle A\rangle_\Lambda^\eta = \sum_{\omega_\Lambda} A(\omega_\Lambda\eta_{\Lambda^c})\, P_\Lambda^\eta(\omega_\Lambda), \tag{6-97}$$

where $P_\Lambda^\eta(\omega_\Lambda)$ is given by (6-90a), (6-90b). In this
expression, $\omega_\Lambda$ denotes an arbitrarily chosen spin configuration
in $\Lambda$ while $\omega_\Lambda\eta_{\Lambda^c}$ denotes a spin
configuration (in the entire lattice) corresponding to the configuration
$\omega_\Lambda$ within $\Lambda$ and the restriction of $\eta$ in
$\Lambda^c$, the complementary set of $\Lambda$ in the infinite lattice.


The DLR condition requires that the same expectation value can also be
worked out by referring to the subset $\Delta$ in the following manner.
Consider an arbitrarily chosen configuration $\omega_\Lambda$ in $\Lambda$
and a configurational boundary condition corresponding to the spin
configuration $\omega_\Lambda\eta_{\Lambda^c}$ defined above. The expectation
value of $A$ over possible spin configurations in $\Delta$ subject to this
boundary condition is denoted, according to our notation, by
$\langle A\rangle_\Delta^{\omega_\Lambda\eta_{\Lambda^c}}$. Multiplying this
with the probability of the chosen configuration $\omega_\Lambda$ in
$\Lambda$ and summing over all possible $\omega_\Lambda \in \Omega_\Lambda$,
we arrive at the required alternative expression for
$\langle A\rangle_\Lambda^\eta$ and hence at the condition

$$\langle A\rangle_\Lambda^\eta = \sum_{\omega_\Lambda} \langle A\rangle_\Delta^{\omega_\Lambda\eta_{\Lambda^c}}\, P_\Lambda^\eta(\omega_\Lambda). \tag{6-98}$$


This condition, if satisfied for all possible choices of $\Lambda$, $\Delta$,
and $A$, and for all possible configurations $\eta$, defines a Gibbs state,
corresponding to some Gibbs measure, say, $\mu$. In other words, among all
possible measures that can be defined over the set $\Omega$ of possible
configurations in the infinite lattice, the Gibbs measures ($\mu$)
representing equilibrium states of the system of spins in the infinite
lattice for given values of the parameters $\beta, h$ (these have been left
implied in our considerations above) are the ones that satisfy (6-98) for all
possible choices of $\Lambda$, $\Delta$, $A$, and $\eta$ (recall that a Gibbs
measure $\mu$ corresponds to a normalized, positive, linear mapping
$A \to \langle A\rangle$). For further details of how Gibbs measures can be
arrived at from (6-97), (6-98), refer to [41].

Figure 6-14: Illustrating the basic idea underlying the DLR consistency
condition; the finite set $\Lambda$ includes the proper subset $\Delta$, the
complementary set of which within $\Lambda$ is $\Lambda\setminus\Delta$;
$\Lambda^c$ is the complementary set of $\Lambda$ in the infinitely extended
lattice; the boundary condition corresponding to the configuration $\eta$
selects out a Gibbs state in $\Lambda$ while, for any spin configuration
$\omega_\Lambda$ in $\Lambda$, $\omega_\Lambda\eta_{\Lambda^c}$ (defined in
text) provides for a boundary condition that selects out a Gibbs state in
$\Delta$; the latter, averaged over all possible configurations
$\omega_\Lambda$ in $\Lambda$, has to be related to the former as in (6-98),
for all possible choices of $\Lambda$, $\Delta$, $\eta$ for any given choice
of $\beta, h$.
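
The consistency condition (6-98) can be checked numerically on a very small
system. What follows is only a minimal sketch in Python, assuming a one
dimensional nearest neighbour Ising chain with energy
$-J\sum s_is_{i+1} - h\sum s_i$ standing in for (6-90a), (6-90b); the window
of sites 0 to 5, the sets LAMBDA and DELTA, the boundary configuration ETA,
the observable, the parameter values, and all function names are hypothetical
choices made for illustration.

```python
import itertools
import math

J, H_FIELD, BETA = 1.0, 0.3, 0.7  # illustrative parameter values (assumptions)

def energy(region, config):
    """Energy of the spins in `region`, including bonds to neighbours outside it.

    `config` maps every site index (inside and just outside `region`) to +1/-1.
    """
    bonds = set()
    for i in region:
        bonds.add((i - 1, i))
        bonds.add((i, i + 1))
    e = -H_FIELD * sum(config[i] for i in region)
    e -= J * sum(config[i] * config[j] for i, j in bonds)
    return e

def gibbs(region, boundary):
    """Finite volume Gibbs distribution in `region`, spins outside frozen to `boundary`.

    Returns a list of (full configuration, probability) pairs.
    """
    region = sorted(region)
    configs, weights = [], []
    for values in itertools.product((-1, 1), repeat=len(region)):
        config = dict(boundary)            # spins outside the region stay frozen
        config.update(zip(region, values)) # spins inside the region are varied
        configs.append(config)
        weights.append(math.exp(-BETA * energy(region, config)))
    z = sum(weights)
    return [(c, w / z) for c, w in zip(configs, weights)]

def expectation(observable, region, boundary):
    """Expectation of `observable` under the Gibbs distribution in `region`."""
    return sum(p * observable(c) for c, p in gibbs(region, boundary))

LAMBDA = {1, 2, 3, 4}                       # the finite set Lambda
DELTA = {2, 3}                              # a proper subset Delta
ETA = {0: +1, 1: -1, 2: +1, 3: +1, 4: -1, 5: -1}  # boundary configuration eta

def observable(config):
    """An arbitrary function A of the spin configuration."""
    return config[2] * config[3] + config[1] - 0.5 * config[4]

# Left hand side of (6-98): expectation in Lambda with boundary condition eta.
lhs = expectation(observable, LAMBDA, ETA)
# Right hand side: expectations in Delta, averaged over configurations in Lambda.
rhs = sum(p * expectation(observable, DELTA, omega)
          for omega, p in gibbs(LAMBDA, ETA))
print(lhs, rhs)
```

Under these assumptions the two printed numbers should agree up to rounding
error, whichever observable or boundary configuration is chosen.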


Extremal Gibbs measures and their convex combinations. 


For given values of the parameters $\beta, h$, the DLR conditions select out
the set of all possible Gibbs distributions that define equilibrium states of
the infinite system under consideration (the Ising system of spins in the
infinite lattice in the present context), and provide the basis on which
the theory of Gibbs measures is developed, the latter having led to a
number of deep and far-reaching results. To begin with, the set of Gibbs
measures for any given $\beta, h$ is non-empty. However, as noted in earlier
sections, the set may contain either one single Gibbs measure or more
than one such measure, i.e., for given $\beta, h$ there can be either a
unique equilibrium state of the infinite system or else a number of
equilibrium states.


In the case that the Gibbs measure is unique (for instance, uniqueness is
a common feature, for dimensions $D \ge 2$, in the high temperature regime
$\beta \to 0$, as implied by the high temperature representation introduced
in sec. 6.2.4.2), the limiting distribution arrived at from finite volume
distributions through a growing sequence of finite sets of lattice points
subject to a configurational boundary condition (say, $\eta$) does not depend
on the latter, i.e., is insensitive to the boundary condition. However, there
exist parameter values for which the thermodynamic limit does involve a
sensitivity to the boundary conditions and these are precisely the ones for
which multiple Gibbs states exist.


For the 2-dim Ising model the Gibbs states (i.e., those measures on the
set of all possible spin configurations in the infinitely extended lattice
that satisfy the DLR condition) for parameter values $\beta > \beta_c$,
$h = 0$ include the two special states described by measures $\mu^+$,
$\mu^-$ that appear (in the thermodynamic limit, under the ‘+’ and ‘-’
conditions respectively) as $\beta$ is made to cross the value $\beta_c$ from
below by the phenomenon of spontaneous symmetry breaking, from the unique
disordered state ($\langle m\rangle = 0$, for which $\mu^+ = \mu^-$).
Moreover, any other Gibbs measure $\mu$ (for $\beta > \beta_c$, $h = 0$) can
be expressed as a convex combination of $\mu^+$ and $\mu^-$, i.e., in the form

$$\mu = a\mu^+ + (1-a)\mu^- \quad (0 < a < 1), \tag{6-99}$$

(in the case $a = 0$ (resp. $a = 1$), $\mu$ reduces to $\mu^-$ (resp.
$\mu^+$)), i.e., the only Gibbs states that cannot be so expressed are
$\mu^+$, $\mu^-$, which are referred to as extremal states. The formula
(6-99) expresses the fact that all Gibbs states (other than $\mu^\pm$) can be
expressed as convex combinations of extremal states.


The extremal Gibbs measures are ergodic in the sense that, for either of 
these, the phase space (i.e., the set of all possible spin configurations in 
the infinitely extended lattice) decomposes into two disjoint subsets such 
that the probability, under either of the two extremal measures, of any
event in one of the two subsets is trivially zero while that of an event in 
the other subset is non-zero, these being referred to as the invariant sets 
corresponding to the extremal state in question. On the other hand, a 
convex combination such as the one given by (6-99) is non-ergodic. The 
two subsets are characterized by the fact that, for each spin configuration 
in one of the two, there exists a spin configuration in the other subset 
related to the former by a global spin flip, i.e., a reversal of the spins 
at all the lattice points. Thus, the phase transition from a disordered 
to an ordered state involves the phenomenon of broken ergodicity and a 


spontaneous symmetry breaking. 


Finally, the extremal states are translationally invariant in the sense de- 
scribed in 6.2.5.3, and so is any convex combination of the form (6-99). 
As mentioned above, any Gibbs state for the 2-dim Ising system is either 
an extremal state or a convex combination of the two extremal states and 


hence, is translationally invariant. 


While an extremal state refers to a pure phase of the system under
consideration (in the sense that it cannot be decomposed further into
extremal states), a convex combination of more than one extremal state is a
statistical mixture of pure phases, i.e., when one samples the configurations
in the phase space, a fraction of those will be found to belong to one of
the two invariant sets mentioned above and the rest to the other invariant
set. It is this fact that explains the translational invariance of a convex
combination of extremal states.


As mentioned above, the 2-dim Ising system does not admit of states
other than the extremal ones and the convex combinations thereof. The
situation is different in any dimension $D \ge 3$, where there exist states
which are not translationally invariant. These states are of a special type
in that they are not statistical mixtures of pure phases, but represent
coexisting phases with surfaces of separation between them, where a surface
of separation is one that can be defined geometrically in macroscopic
terms. For instance, a state arrived at in the thermodynamic limit from
finite volume Gibbs measures subject to the Dobrushin boundary condition
(refer to fig. 6-8) is one where the two pure phases are separated
by a planar surface with microscopic fluctuations. In the case of the 2-dim
Ising system, on the other hand, the fluctuations are of macroscopic
proportions and the state based on the Dobrushin boundary condition
reduces to the statistical mixture given by
$\frac{1}{2}\mu^+ + \frac{1}{2}\mu^-$.
Gibbs measures: the variational principle. 


Recall the principle of minimum free energy for a finite system stated in
sec. 2.1.5.3 where the principle was established in the quantum context;
the proof goes essentially along the same lines in the classical context as
well (try this out).


This principle applies to the Ising system defined on a finite set of lattice
points $\Lambda$. More precisely, let $\mu$ be an arbitrarily chosen
probability distribution over the set of spin configurations
($\Omega_\Lambda$) in $\Lambda$, that need not be a Gibbs distribution. We
look upon the distribution $\mu$ as specifying a ‘non-equilibrium’ state of
the system (for given values of the parameters $\beta = (k_BT)^{-1}$, $h$)
and define its ‘entropy’ as

$$S(\mu) = -k_B \sum_{\omega_\Lambda \in \Omega_\Lambda} \mu(\omega_\Lambda) \ln \mu(\omega_\Lambda), \tag{6-100a}$$

where $\omega_\Lambda$ denotes a typical spin configuration belonging to
$\Omega_\Lambda$. The ‘internal energy’ of the distribution is defined as the
expectation value of the energy $H(\omega_\Lambda)$ under $\mu$:

$$U(\mu) = \sum_{\omega_\Lambda \in \Omega_\Lambda} H(\omega_\Lambda)\,\mu(\omega_\Lambda). \tag{6-100b}$$


If we now seek a distribution that minimizes the ‘specific free energy’
$f(\mu)$ defined as

$$f(\mu) = \frac{1}{|\Lambda|}\left[U(\mu) - TS(\mu)\right], \tag{6-101}$$

with $|\Lambda|$ the number of lattice points in $\Lambda$, then that
minimizer will be precisely the finite volume Gibbs distribution specifying
the equilibrium state of the spin system defined on the set of lattice points
$\Lambda$ (we leave the boundary condition on the system as implied).
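
The finite volume minimum principle can be illustrated numerically. The
sketch below, in Python, assumes a ring of four Ising spins with nearest
neighbour coupling and a periodic boundary condition (one possible choice of
the boundary condition left implied above); it compares $U(\mu) - TS(\mu)$
for the Gibbs distribution with its value for randomly drawn distributions.
The parameter values and all names in the code are hypothetical, and the
common factor $1/|\Lambda|$ is omitted since it does not affect the location
of the minimum.

```python
import itertools
import math
import random

J, H_FIELD, K_B, T = 1.0, 0.2, 1.0, 1.5   # illustrative values (assumptions)
N = 4                                      # ring of N Ising spins

def hamiltonian(spins):
    """Nearest neighbour Ising energy on a ring (periodic boundary condition)."""
    bonds = sum(spins[i] * spins[(i + 1) % N] for i in range(N))
    return -J * bonds - H_FIELD * sum(spins)

CONFIGS = list(itertools.product((-1, 1), repeat=N))
ENERGIES = [hamiltonian(s) for s in CONFIGS]

def free_energy(mu):
    """U(mu) - T S(mu) for a probability vector mu over CONFIGS."""
    u = sum(p * e for p, e in zip(mu, ENERGIES))
    s = -K_B * sum(p * math.log(p) for p in mu if p > 0.0)
    return u - T * s

# The Gibbs distribution and its free energy.
beta = 1.0 / (K_B * T)
weights = [math.exp(-beta * e) for e in ENERGIES]
Z = sum(weights)
gibbs = [w / Z for w in weights]
f_gibbs = free_energy(gibbs)

# Randomly drawn 'non-equilibrium' distributions for comparison.
rng = random.Random(0)

def random_distribution():
    raw = [rng.random() for _ in CONFIGS]
    total = sum(raw)
    return [r / total for r in raw]

f_trials = [free_energy(random_distribution()) for _ in range(1000)]
print(f_gibbs, -K_B * T * math.log(Z), min(f_trials))
```

The first two printed numbers should coincide, since the Gibbs distribution
gives $U - TS = -k_BT\ln Z$, while the third should not fall below them.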
This principle of minimum free energy carries over to the case of the
infinitely extended lattice, provided that a number of mathematical issues
are properly attended to (refer to [41], chapter 6). It turns out that a
Gibbs measure over the set of spin configurations in the infinite lattice is
a minimizer of the non-equilibrium specific free energy functional
$\mu \to f(\mu)$ that can be defined in a manner analogous to (6-101) over
the set of measures $\mu$, where a typical measure $\mu$ is defined over the
(non-denumerably) infinite set of all possible spin configurations in the
infinitely extended lattice. What is more, any such minimizer, selected out
by the DLR condition, turns out to be a translationally invariant one. Recall
that the translationally invariant Gibbs measures exhaust the set of all
possible equilibrium distributions for the 2-dim Ising system. For dimension
$D > 2$, on the other hand, there exist Gibbs measures that do not satisfy
the minimum principle.


6.3 Landau-Ginzburg theory of phase transition


6.3.1 Landau-Ginzburg theory: introduction 


Phase transitions of the first kind were considered in section 5.7. Phase 
transitions of the second kind appear as limiting transitions where there 
occurs a continuous passage from a disordered to an ordered phase in- 
volving long wavelength fluctuations of the microscopic variables. Among 
all possible types of fluctuations, Landau postulated the existence of
some particular mode of overriding relevance, defining the order parameter
for the phase transition under consideration. Making use of the idea
of the order parameter as a relevant state variable, a remarkably versatile 
theory of phase transitions of the second kind was developed by Landau 
and Ginzburg where one looks for a free energy functional in terms of 
which a great number of features of phase transitions can be explained 
in the form of a coherent theory, referred to as the Landau-Ginzburg (LG) 


theory. 


The free energy functional provides one with a coarse-grained representa- 
tion of the system of interest, since it determines the partition function by 
means of a functional integral involving a single mode whose spatial vari- 
ations over short distances (of the order of the lattice spacing in the case 
of a lattice model) are ignored. This, in effect, is a coarse-graining over 
a scale that is intermediate between the microscopic length scale giving 
the detailed description of a system, and the macroscopic scale describ- 
ing its thermodynamic behavior. Appropriate coarse-grained descriptions 
can always be invoked in addressing critical phenomena since the latter 
are all dominated by long-wavelength fluctuations that are insensitive to 


microscopic details relating to the systems of interest. 


An approximate evaluation of the functional integral can be performed in 
terms of a minimum principle relating to the relevant mode - a procedure 
that defines the Landau mean field approach. The mean field approach 
is incomplete in the sense that it ignores, in effect, the spatial fluctua- 
tions of the order parameter and hence does not correctly describe the 
correlations characterizing the system on either side of the transition. 


The latter are obtained by evaluating the leading corrections to the mean
field values, which makes the theory a truly prolific one. Though, from
a fundamental point of view, it is still phenomenological in nature, it
can be related to microscopic features for numerous systems of interest,
which is why it now constitutes a tool of major relevance in equilibrium
and non-equilibrium statistical mechanics.


The content of the present section and the next (sec. 6.4) is principally
based on excellent and authoritative presentations in [67], [140], [99], and
[128].


In the following we will refer, in the main, to discretely distributed sys- 
tems of spins where a ‘spin’ will stand for a fictitious classical object 
whose possible instantaneous states make up a finite dimensional space. 
We will refer, in particular, to Ising spins where an Ising spin can be in
any one of only two possible states, described as $s = +1$ (‘up’ state) and
$s = -1$ (‘down’ state), as in sec. 6.2 (refer also to sec. 6.5 below). Though
fictitious by definition, such assemblies of spins are of great relevance in 
respect of numerous systems of practical interest besides being objects 


of immense heuristic value in statistical mechanics. 


As mentioned above, the Landau-Ginzburg theory makes use of the cen- 
tral idea of an order parameter where, in the case of a spin system, the 
order parameter corresponds to the magnetization field m(x), the aver- 
age spin per unit volume in sites around any specified point x in the 
lattice (the term ‘magnetization’ does not necessarily refer to a magnetic
property, though it does have such a connotation in the case of a magnetic
material represented by a spin system). The significance of an order
parameter can be explained with reference to the paramagnetic to
ferromagnetic phase transition (once again, the terms ‘paramagnetic’ and
‘ferromagnetic’ can refer to an actual magnetic system or can be used
in a figurative sense for model systems defined in more abstract terms
such as the Ising model). The magnetization distinguishes between the
paramagnetic phase ($m = 0$ for $h = 0$, where $h$ stands for an external
field parameter as in (6-1)) and the ferromagnetic phase ($m \ne 0$ for
$h = 0$), and is indicative of a spontaneously broken symmetry. More
precisely, the
Ising Hamiltonian with h = 0 is invariant under the operation of global 
spin flip, while the ferromagnetic state corresponding to any one of two 
possible non-zero values of the magnetization m (a scalar in the Ising 
case) is not invariant under the same operation. In the case of a system 
made up of Heisenberg spins, m(x) is a three dimensional vector: the 
Hamiltonian (with zero external field) is invariant under the three dimen- 
sional rotation group, while the state with any specific non-zero value of 


m below the transition temperature is not invariant. 


The spontaneous breakdown of symmetry is associated with long wave- 
length fluctuations of the order parameter where these long wavelength 
fluctuations continuously tip the scale away from a disordered configuration
in favor of an ordered one (with a non-zero value of the order parameter)
in a continuous phase transition as the temperature is made to cross a
transition value. In the case of a continuous symmetry these fluctuations 
of the order parameter appear as Goldstone modes of vanishingly small 
energy. The ‘tipping of the scale’ is quantitatively described in terms of 


the free energy functional mentioned above, as we outline below. 


Though we organize our discussion by referring to Ising spins on a discrete
lattice, the Landau-Ginzburg (LG) theory (and also the renormalization
group (RG) theory of continuous phase transitions to be outlined in
sec. 6.4) is of a wide scope and includes continuously distributed systems 
as well. In addition, the spatial dimension of the system can have an ar- 
bitrarily chosen integer value (D) while the order parameter can also have 
any specified number (d) of independent components. In the case of the 
three dimensional Ising model, we have D = 3,d = 1, while the Onsager 
solution pertains to D = 2,d = 1. The gas-liquid transition once again 
belongs to the category D = 3,d = 1 (scalar order parameter specifying 
the difference between the specific volumes of the gaseous and the liquid 
phases). The universality class of a continuous phase transition (refer 
to sec. 6.2.3.4, and also to sec. 6.4 below) is determined by the values 
of D and d. Thus, the phase transition in the three dimensional Ising 
model and the gas-liquid phase transition both belong to the same uni- 
versality class even though the former pertains to a discretely distributed 
system and the latter to a continuous one. Indeed, since a continuous 
phase transition occurs by means of long wavelength fluctuations, it is 
immaterial as to whether the system under consideration is discrete or 
continuous in nature. For both types of systems it suffices to consider
fluctuations with wavelengths ($\lambda$) larger than a certain minimum
cut-off value, say $\Lambda$ where, in the Ising case, $\Lambda$ is to be at
least of the order of the lattice spacing ($a$).


6.3.2 LG theory: free energy 


With this background we start from the expression of the partition function
in terms of a functional integral involving the order parameter
$\mathbf{m}(\mathbf{x})$, which serves as the basic formula of the LG theory.
In outlining the theory we refer, at times, to a spin system such as the
Ising model for making our ideas concrete, though the basic approach is of
much wider applicability. The partition function, related to the free energy
in the usual manner, is of the form

$$e^{-\beta F} = Z(T, \mathbf{h}) = \int \mathcal{D}\mathbf{m}(\mathbf{x})\, e^{-\beta H_{LG}[\mathbf{m}(\mathbf{x})]}, \tag{6-102a}$$

where one needs to explain the notation. In (6-102a), $\mathbf{h}$ stands for
the external field parameter and $H_{LG}$ is referred to as the
Landau-Ginzburg Hamiltonian, which is expressed as the spatial integral of a
‘free energy density’ $f$ (see formula (6-102b) below), where the latter may
depend on $\mathbf{m}(\mathbf{x})$ through its derivatives of various orders.
The second equality in the above formula constitutes the basic statement of
the LG theory. Here $\int\mathcal{D}$ indicates a functional integration over
possible forms of functional dependence of the order parameter $\mathbf{m}$
on the position vector $\mathbf{x}$. Referring to any arbitrarily specified
function $\mathbf{m}(\mathbf{x})$, the LG Hamiltonian $H_{LG}$ is expressed
as a spatial integral of the form

$$H_{LG} = \int d^{[D]}x\, f(\mathbf{m}(\mathbf{x})), \tag{6-102b}$$


where $\int d^{[D]}x$ indicates D-dimensional spatial integration and where we
finally need to specify the function $f$ so as to complete the formulation
of the basic principle underlying the LG theory. This is done by recognizing
that, close to the transition, $\mathbf{m}(\mathbf{x})$ is small in magnitude
(indeed, $\mathbf{m}(\mathbf{x}) = 0$ in the disordered phase for
$\mathbf{h} = 0$, while it acquires a small non-zero value just below the
transition temperature) and its spatial variation arises through long
wavelength fluctuations. In addition, a number of physical constraints,
arising from homogeneity and isotropy of the underlying space and from the
requirement of rotational symmetry of the Hamiltonian for $\mathbf{h} = 0$,
imply the following form,

$$\beta f = -\mathbf{h}\cdot\mathbf{m} + \frac{\tau}{2}m^2 + u\,m^4 + \frac{K}{2}(\nabla\mathbf{m})^2 + \cdots, \tag{6-102c}$$

where the external field parameter $\mathbf{h}$ is a d-dimensional object
(thermodynamic variable conjugate to $\mathbf{m}$), and the dots represent
higher order terms that may be needed in degenerate situations. The constants
$\tau, u, K$ appearing in (6-102c) are phenomenological parameters that may
depend on the constitution of the system under consideration and on external
parameters such as temperature and pressure. Among these, $u$ and $K$ are
required to be positive for global stability of the system.


1. The functional integral in (6-102a) can be interpreted as the limit of
an ordinary multiple integral by imagining the underlying space to be
replaced with a lattice of discretely distributed points identified with
a D-component index $i$, and replacing a function $m(\mathbf{x})$ with the
discrete set of values $m_i$ assumed by the function at the lattice points.
In the simplest case of $D = 1$ (and a scalar order parameter, $d = 1$),
assuming the lattice spacing to be $a$, one obtains, for any specified
function $\phi(m(x), \nabla m(x), \cdots)$,

$$\int \mathcal{D}m(x)\, \phi\big(m(x), \nabla m(x), \cdots\big) = \lim_{N\to\infty} \int \prod_{i=1}^{N} dm_i\, \phi\Big(m_i, \frac{m_{i+1}-m_i}{a}, \cdots\Big). \tag{6-103}$$

2. Though we refer to $\mathbf{h}$ and $\mathbf{m}$ as the ‘magnetic field’
and the ‘mean magnetic moment density’, their dimensions are left unspecified
in the following. The dimensions of the constants $\tau$, $u$, and $K$ are
left similarly unspecified. This leaves the door open for identifying these
quantities with physical variables and parameters pertaining to diverse
systems that can be described in terms of the LG theory.
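
The lattice interpretation in note 1 can be made concrete with a short
numerical sketch. The Python fragment below assumes a scalar order parameter
on a one dimensional lattice with periodic boundary condition, replaces the
gradient by the forward difference $(m_{i+1} - m_i)/a$, and sums the density
of (6-102c) (truncated after the gradient term) over the lattice sites. The
function name, the parameter values, and the two trial configurations are
hypothetical and serve only as an illustration.

```python
import math

def lg_hamiltonian(m, a, tau, u, K, h):
    """Discretized spatial integral of the density (6-102c) on a periodic 1-D lattice.

    m: list of field values m_i; a: lattice spacing; the gradient is replaced
    by the forward difference (m_{i+1} - m_i) / a.
    """
    n = len(m)
    total = 0.0
    for i in range(n):
        grad = (m[(i + 1) % n] - m[i]) / a
        density = (-h * m[i] + 0.5 * tau * m[i] ** 2 + u * m[i] ** 4
                   + 0.5 * K * grad ** 2)
        total += a * density          # a plays the role of the volume element
    return total

N_SITES, A = 64, 0.5
TAU, U, K_STIFF, H = -1.0, 1.0, 1.0, 0.0   # illustrative values (assumptions)
m0 = math.sqrt(-TAU / (4.0 * U))            # uniform minimizer for tau < 0, h = 0

uniform = [m0] * N_SITES
wavy = [m0 + 0.1 * math.sin(2.0 * math.pi * i / N_SITES) for i in range(N_SITES)]

e_uniform = lg_hamiltonian(uniform, A, TAU, U, K_STIFF, H)
e_wavy = lg_hamiltonian(wavy, A, TAU, U, K_STIFF, H)
print(e_uniform, e_wavy)
```

With a positive stiffness constant, the spatially varying configuration
should come out with the larger value, in line with the statement that
spatial variations of the order parameter cost energy.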


In order to obtain the free energy $F$ for arbitrarily specified values of
the parameters $T, \mathbf{h}$, and then to evaluate thermodynamic functions
like the magnetization (the order parameter characterizing the transition),
internal energy, entropy, and susceptibility, one needs to work out the
functional integral in (6-102a). This, in general, is a prohibitive task but
an approximation can be obtained by invoking the saddle-point method (see,
for instance, [74], chapter 2) that gives

$$Z = e^{-\beta F} \approx \exp\big(-\beta [H_{LG}]_{\min}\big), \tag{6-104}$$


where, for specified values of $T (= (k_B\beta)^{-1})$ and $\mathbf{h}$,
$[H_{LG}]_{\min}$ stands for the minimum value of $H_{LG}$ with reference to
variations of $\mathbf{m}(\mathbf{x})$. For positive $K$, the minimum
necessarily corresponds to a uniform magnetization
$\mathbf{m}(\mathbf{x}) = \text{constant}$ ($\int d^{[D]}x\, f$ can be made
arbitrarily large by choosing $\mathbf{m}(\mathbf{x})$ to have a sufficiently
short-range spatial variation), which implies that

$$[H_{LG}]_{\min} = \int d^{[D]}x\, f(\bar{\mathbf{m}}), \tag{6-105a}$$

where $\bar f = f(\bar{\mathbf{m}})$ is obtained by minimizing the Landau free
energy density

$$f = -\mathbf{h}\cdot\mathbf{m} + \frac{\tau}{2}m^2 + u\,m^4, \tag{6-105b}$$

and $\bar{\mathbf{m}}$ is the magnetization that minimizes $f$. The free
energy $F$ is thereby approximated as

$$F = V^{[D]}\bar f, \tag{6-105c}$$

where $V^{[D]}$ stands for the D-dimensional volume that goes to infinity in
the thermodynamic limit, and $\bar f$ is obtained from the solution of

$$\frac{\partial f}{\partial \mathbf{m}} = 0. \tag{6-105d}$$


Strictly speaking, the application of the saddle-point method yields
formula (6-104) with a pre-factor on the right hand side; this, however, can
be ignored in the thermodynamic limit in working out the thermodynamic
functions since these are all obtained from the logarithm of $Z$ by
differentiation.


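If this condition of stationarity is satisfied for some
$\mathbf{m} = \bar{\mathbf{m}}$, then the condition that this corresponds to
a minimum value of $f$ is that all the eigenvalues of the $d \times d$
Hessian matrix $M$ are to be positive, where $M$ is given by

$$M_{ij} = \left.\frac{\partial^2 f}{\partial m_i\,\partial m_j}\right|_{\mathbf{m}=\bar{\mathbf{m}}} \quad (i, j = 1, 2, \ldots, d). \tag{6-105e}$$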


6.3.3 LG theory: thermodynamic parameters 


The equilibrium properties of the system under consideration are obtained
from the Landau free energy density $f$ which defines the Landau mean field
theory. We outline below a number of conclusions that follow from this
theory.


1. For $\tau > 0$, there is only one solution to (6-105d) in the presence of a
non-zero field, say, $\mathbf{h} = h\hat{\mathbf{e}}$, where
$\hat{\mathbf{e}}$ is any specified unit vector and $h > 0$. For small $h$,
the unique solution is given by

$$\bar{\mathbf{m}} \approx \frac{h}{\tau}\,\hat{\mathbf{e}}. \tag{6-106}$$

This corresponds to a stable equilibrium configuration since the
Hessian matrix possesses d number of degenerate eigenvalues $\lambda = \tau$,
all positive. Since the magnetization vanishes for $h \to 0$, there is no
spontaneous magnetization, and the equilibrium corresponds to a
paramagnetic phase.


2. For $\tau < 0$, we first consider a situation with zero external field
($\mathbf{h} = 0$), when one obtains a single stationary solution
($\mathbf{m} = 0$) which actually corresponds to a maximum of the free energy
density $f$ (eigenvalues of $M$ degenerate and negative), while there arises
a continuum of degenerate minima, all with the same value of
$|\bar{\mathbf{m}}|$, namely,

$$|\bar{\mathbf{m}}| = \sqrt{-\frac{\tau}{4u}} \quad (\tau < 0). \tag{6-107}$$

For an equilibrium configuration corresponding to any one of these
degenerate solutions, the d-dimensional rotational symmetry built
into the Hamiltonian (6-102b) is broken spontaneously, and the
non-zero magnetization in the absence of an external field is indicative of
a continuous (second order) phase transition. The variation of
the functional form of the free energy density $f$ is depicted schematically
in figure 6-15.


If one now assumes a small non-zero value of the external field 
(h = hé) with 7 < 0 (|r|? << h, see below), one obtains, apart from 
the solution $\mathbf{m} = 0$ corresponding to a maximum of $f$, two other
solutions given by

$$\bar{\mathbf{m}} = \left[\pm\sqrt{-\frac{\tau}{4u}} - \frac{h}{2\tau}\right]\hat{\mathbf{e}}, \tag{6-108}$$


of which the lower sign corresponds to a local minimum and the up- 
per sign to a global minimum, the latter corresponding to the stable 
equilibrium configuration. Here the magnetization does not result 
from a spontaneous breakdown of symmetry but by a breaking of 


symmetry resulting from the external field. 
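
The statements in items 1 and 2 can be checked numerically for a scalar order
parameter ($d = 1$). The Python sketch below locates the real roots of the
stationarity condition $-h + \tau m + 4um^3 = 0$ following from (6-105b) by
bisection on the monotonic pieces of the cubic, and then picks the root with
the lowest value of $f$. The function names and the parameter values are
hypothetical; the choice of bisection is made here for robustness and is not
part of the theory.

```python
import math

def landau_f(m, tau, u, h):
    """Landau free energy density (6-105b) for a scalar order parameter."""
    return -h * m + 0.5 * tau * m ** 2 + u * m ** 4

def stationary_points(tau, u, h):
    """Real solutions of df/dm = -h + tau*m + 4*u*m**3 = 0, for u > 0."""
    def g(m):
        return -h + tau * m + 4.0 * u * m ** 3

    # All real roots lie in (-big, big) for this choice of big.
    big = 1.0 + math.sqrt(abs(tau) / u) + (abs(h) / u) ** (1.0 / 3.0)
    edges = [-big, big]
    if tau < 0.0:
        mc = math.sqrt(-tau / (12.0 * u))   # turning points of g
        edges = [-big, -mc, mc, big]
    roots = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        g_lo, g_hi = g(lo), g(hi)
        if (g_lo > 0.0) == (g_hi > 0.0):
            continue                        # no sign change, hence no root here
        for _ in range(200):                # bisection on a monotonic piece
            mid = 0.5 * (lo + hi)
            if (g(mid) > 0.0) == (g_lo > 0.0):
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def equilibrium_m(tau, u, h):
    """Stationary point with the lowest value of f (the global minimum)."""
    return min(stationary_points(tau, u, h),
               key=lambda m: landau_f(m, tau, u, h))

U = 1.0                                       # illustrative value (assumption)
m_para = equilibrium_m(1.0, U, 1e-4)          # compare with h / tau, (6-106)
zero_field = stationary_points(-1.0, U, 0.0)  # compare with (6-107)
m_up = equilibrium_m(-1.0, U, 1e-4)           # compare with (6-108), upper sign
print(m_para, zero_field, m_up)
```

For these values one expects a magnetization close to $h/\tau$ in the first
case, stationary points close to $0$ and $\pm\sqrt{-\tau/(4u)}$ in the second,
and a global minimum close to $\sqrt{-\tau/(4u)} - h/(2\tau)$ in the third.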


One can interpret the above results by saying that there occurs a phase
transition of the second kind for $h = 0$, analogous to one outlined in
sections 6.2.4 and 6.2.5, and identify $\tau$ as a parameter related to the
transition temperature $T_c$ as


(6-109) 


where $\tau_0$ is a positive constant. The constants $u$, $K$, on the other
hand, remain finite and positive as $T \to T_c$ under this interpretation.
The transition becomes one of the first kind in the presence of a small
non-zero $h$, as $h$ is made to vary across 0, and the critical point of the
transition is at $T = T_c$, $h = 0$ (refer to fig. 6-13).

Figure 6-15: Depicting the functional form of the Landau free energy density
$f$ (refer to (6-105b); schematic); (A) $\tau > 0$: the solid line depicts the
variation of $f$ with $m$, the magnetization, along any chosen direction in
the spin space for $h = 0$; the free energy is minimum for
$m (= |\mathbf{m}|) = 0$; the dotted curve depicts the variation for non-zero
value of the field parameter $h (> 0)$, where
$\mathbf{h} = h\hat{\mathbf{e}}$; the minimum of the free energy now gets
shifted to a non-zero value of the magnetization along $\hat{\mathbf{e}}$;
(B) $\tau < 0$: the solid line again depicts the variation of $f$ for $h = 0$,
when $m = 0$ corresponds to a local maximum rather than a minimum, while two
degenerate minima (belonging to a continuously varying set) appear in
accordance with (6-107); for $h \ne 0$ (dotted line), one of the two minima
(that now appear for any specified direction $\hat{\mathbf{e}}$) lies at a
higher value of the free energy than the other, which now becomes an absolute
minimum.


The equilibrium values of the thermodynamic quantities such as the internal
energy, entropy, and the mean magnetization, are obtained in the LG theory
without reference to fluctuations in $\mathbf{m}(\mathbf{x})$ about
$\bar{\mathbf{m}}$ which, close to the critical point, are long wavelength
ones. On the other hand, spin correlations are determined in terms of such
fluctuations. Results obtained on the basis of the saddle-point approximation
are, in some texts, referred to as defining the Landau theory of phase
transitions, while those obtained by taking into account the fluctuations are
said to define the Landau-Ginzburg theory. This demarcation between the two
areas of discourse is nothing more than a convention.

With this identification, one can work out the critical exponents pertaining
to the Landau theory, that come out to be the same as those obtained in
the mean field theory in sec. 6.2.3.4.

Thermodynamic parameters typically have a non-analytic power law dependence
on $\tau$ close to the critical point in a second order phase transition. The
critical exponents pertaining to these parameters are defined in terms of the
corresponding powers of $\tau$.


Thus, with $h = 0$ and $T$ less than but close to $T_c$, one has, from (6-107),

$$\bar m \sim (T_c - T)^{\beta}, \quad \beta = \frac{1}{2}, \tag{6-110a}$$

(note that the critical exponent here is denoted by $\beta$ in deference to
common usage, and is not to be confused with the inverse temperature
$\beta = (k_BT)^{-1}$).


Making use of the equilibrium values of $\mathbf{m}$ for $\tau > 0$ and
$\tau < 0$ in the zero field case, and the relations (6-105b), (6-105c)
(recall that $\bar f = f|_{\mathbf{m}=\bar{\mathbf{m}}}$), one obtains the
free energy ($F = -\beta^{-1}\ln Z$) close to $T_c$:

$$[\text{for } T > T_c]\ F = 0, \qquad [\text{for } T < T_c]\ F = -\frac{V^{[D]}}{16u}\,\tau^2. \tag{6-110b}$$

Here we have not taken into account a part of the free energy that is analytic
in the temperature around $T_c$, since this part is of no relevance in
describing the characteristics of the phase transition.

One can then work out the internal energy
$U = -\frac{\partial}{\partial\beta}\ln Z$, and the specific heat per unit
volume $c\,(= \frac{1}{V^{[D]}}\frac{\partial U}{\partial T})$ to obtain, for
$T$ close to $T_c$,


k; 
fora > 7) c=, ffor P< T.]e=e+ 5 
u 


(Lé.,) @=0, a =0. (6-110c) 


In the above relations, $\bar c$ stands for that part of the specific heat
that results from the non-singular part of $F$ (not included in (6-110b)),
which is not of relevance in the present context. Though the critical
exponents $\alpha, \alpha'$ describing the behavior of the specific heat are
both zero, there occurs a finite discontinuity at $T = T_c$.


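The susceptibility $\chi = \frac{\partial m}{\partial h}$ for $T \approx T_c$, and the corresponding critical exponents on the two sides of the transition, are given by (refer to (6-106), (6-108))

$$[\text{for } T > T_c]\ \chi \sim \frac{1}{\tau}, \qquad [\text{for } T < T_c]\ \chi \sim -\frac{1}{2\tau}, \qquad \text{(i.e.,)}\ \gamma = 1,\ \gamma' = 1. \tag{6-110d}$$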


Finally, putting $\tau = 0$ in (6-105b), one obtains the equation of the critical isotherm, and the corresponding critical exponent $\delta$ (compare with (6-42b)):

$$m = \left(\frac{h}{4u}\right)^{1/3}, \qquad \text{(i.e.,)}\ \delta = 3. \tag{6-110e}$$


All these values of the critical exponents constitute an exact match with corresponding results in sec. 6.2.3.4, where we worked out the critical exponents of the Ising model in the mean field approximation. As we see from these results, these critical exponents do not depend on the dimensions $D$ and $d$ (since a phase transition is ruled out for $D = 1$, this case is excluded from our considerations; refer back to sec. 5.7.3), and hence do not distinguish between universality classes.


In the case of the nearest neighbor Ising model, the Landau free energy (6-105c) 
(with h expressed in terms of m) is obtained from the mean field expres- 
sion (6-37) by expanding the latter in a close vicinity of the critical point 
and retaining the leading terms. This relation between the mean field the- 
ory and the Landau theory obtains for systems of various other descriptions 


as well. 
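The Landau exponents quoted above can be checked numerically with a minimal sketch. It assumes that the Landau free energy density of (6-105b), which is not reproduced in this section, has the form $f(m) = \frac{\tau}{2}m^2 + u m^4 - hm$; this form is inferred from the coefficients appearing later in (6-115b) and (6-116b) and should be compared with (6-105b). The function names and the value $u = 1$ are illustrative choices.

```python
import math

def equilibrium_m(tau, h, u=1.0):
    """Non-negative minimiser of f(m) = tau/2 m^2 + u m^4 - h m for h >= 0.

    Assumed form of the Landau free energy density (to be checked against
    (6-105b)). The stationarity condition f'(m) = 0 is solved by bisection.
    """
    def df(m):
        return tau * m + 4.0 * u * m**3 - h

    lo = 1e-15
    if df(lo) >= 0.0:          # minimum sits at m = 0 (tau >= 0 and h = 0)
        return 0.0
    hi = 1.0 + math.sqrt(abs(tau) / u) + (h / u) ** (1.0 / 3.0)  # df(hi) > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if df(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_slope(x1, y1, x2, y2):
    """Exponent p in y ~ x^p estimated from two points."""
    return math.log(y1 / y2) / math.log(x1 / x2)

# beta: m ~ (-tau)^beta at h = 0, tau < 0
beta = log_slope(1e-3, equilibrium_m(-1e-3, 0.0), 1e-4, equilibrium_m(-1e-4, 0.0))

# delta: m ~ h^(1/delta) at tau = 0
delta = 1.0 / log_slope(1e-6, equilibrium_m(0.0, 1e-6), 1e-8, equilibrium_m(0.0, 1e-8))

# gamma: chi = dm/dh ~ tau^(-gamma) for tau > 0, using a small probe field
dh = 1e-9
def chi(tau):
    return equilibrium_m(tau, dh) / dh
gamma = -log_slope(1e-2, chi(1e-2), 1e-3, chi(1e-3))

print(beta, gamma, delta)
```

Under the stated assumption the three estimates should come out close to the values $\beta = \frac{1}{2}$, $\gamma = 1$, $\delta = 3$ quoted in (6-110a), (6-110d), (6-110e).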


6.3.4 LG theory: correlations 


The saddle-point evaluation of the free energy gives the leading approx- 
imation for a system of large size. Leading corrections to this approxi- 
mation are obtained by taking into account the fluctuations of the order 
parameter in a manner such that the partition function appears as a 


Gaussian functional integral. 


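We start from the basic result for a Gaussian integral involving $N$ real variables $\phi_i$ ($i = 1, 2, \ldots, N$):

$$Z_N \equiv \int_{-\infty}^{\infty} \prod_{i=1}^{N} d\phi_i\, \exp\left[-\frac{1}{2}\phi^T G^{-1}\phi + h\cdot\phi\right] = \left[\det(2\pi G)\right]^{1/2} \exp\left[\frac{1}{2}h^T G h\right], \tag{6-111}$$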


where $\phi$ and $h$ stand for $N$-dimensional column vectors while $G^{-1}$ is the inverse of an $N \times N$ matrix $G$, i.e., $G, G^{-1}$ are operators in the space of $N$-vectors.

In the limit when the above integral goes over to a functional integral in the space of functions $\phi(x)$, $G, G^{-1}$ will represent appropriate operators in that space.


If the integrand $\exp\left[-\frac{1}{2}\phi^T G^{-1}\phi + h\cdot\phi\right]$ represents the (unnormalized) joint probability distribution of random variables $\phi_i$ ($i = 1, 2, \ldots, N$) (in which case $Z_N$ stands for the normalization factor for the distribution), then the cumulants of order 1 and 2 are obtained as

$$\langle \phi_i \rangle_c = \sum_j G_{ij} h_j, \qquad \langle \phi_i \phi_j \rangle_c = G_{ij} \qquad (i, j = 1, 2, \ldots, N), \tag{6-112}$$

while cumulants of all higher orders evaluate to zero.


1. The formulae (6-111), (6-112) are obtained by diagonalizing the quadratic expression $\left[-\frac{1}{2}\phi^T G^{-1}\phi + h\cdot\phi\right]$ and evaluating the resulting product of Gaussian integrals.

2. The cumulants are connected correlation coefficients of the random variables such as

$$\langle \phi_i \rangle_c = \langle \phi_i \rangle, \qquad \langle \phi_i \phi_j \rangle_c = \langle \phi_i \phi_j \rangle - \langle \phi_i \rangle \langle \phi_j \rangle, \ \ldots \qquad (i, j = 1, 2, \ldots, N), \tag{6-113}$$

where $\langle \cdot \rangle$ represents the expectation value relative to the probability distribution under consideration. The connected correlations for a set of uncorrelated random variables are all zero.
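The results (6-111) and (6-112) can be checked by brute force for $N = 2$. The sketch below integrates the Gaussian weight on a grid and compares the normalization, the means and the connected two-point correlations with the closed-form expressions. The matrix `G`, the vector `h` and the grid parameters are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical example data: G must be symmetric and positive definite.
G = [[1.0, 0.3], [0.3, 0.5]]
h = [0.4, -0.2]

det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]
G_inv = [[G[1][1] / det_G, -G[0][1] / det_G],
         [-G[1][0] / det_G, G[0][0] / det_G]]

def weight(p1, p2):
    """Unnormalised weight exp[-1/2 phi^T G^-1 phi + h . phi]."""
    quad = G_inv[0][0] * p1 * p1 + 2.0 * G_inv[0][1] * p1 * p2 + G_inv[1][1] * p2 * p2
    return math.exp(-0.5 * quad + h[0] * p1 + h[1] * p2)

# Midpoint rule on the square [-half_width, half_width]^2.
half_width, n = 10.0, 400
step = 2.0 * half_width / n
grid = [-half_width + (i + 0.5) * step for i in range(n)]

Z = s1 = s2 = s11 = s12 = s22 = 0.0
for p1 in grid:
    for p2 in grid:
        w = weight(p1, p2) * step * step
        Z += w
        s1 += w * p1
        s2 += w * p2
        s11 += w * p1 * p1
        s12 += w * p1 * p2
        s22 += w * p2 * p2

mean = [s1 / Z, s2 / Z]
cov = [[s11 / Z - mean[0] ** 2, s12 / Z - mean[0] * mean[1]],
       [s12 / Z - mean[0] * mean[1], s22 / Z - mean[1] ** 2]]

# Closed forms from (6-111) and (6-112); det(2 pi G) = (2 pi)^2 det G for N = 2.
hGh = sum(h[i] * G[i][j] * h[j] for i in range(2) for j in range(2))
Z_exact = math.sqrt((2.0 * math.pi) ** 2 * det_G) * math.exp(0.5 * hGh)
mean_exact = [sum(G[i][j] * h[j] for j in range(2)) for i in range(2)]

print(Z, Z_exact)
print(mean, mean_exact)
print(cov, G)
```

The numerically computed covariance should reproduce the matrix `G` itself, which is the content of the second equality in (6-112).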


A Gaussian functional integral can be looked upon as a limiting case of an integral of the form (6-111) where the discrete index $i$ is interpreted as one labeling the sites of a $D$-dimensional lattice and the lattice constant is made to go to zero. In this limit, the column vectors $\phi$ and $h$ go over to functions $\phi(x)$, $h(x)$ of the $D$-dimensional position vector $x$ (one may also consider a constant function $h(x) = h$), while the matrices $G, G^{-1}$ appear as operators $G(x, x')$, $G^{-1}(x, x')$, satisfying

$$\int d^{(D)}x'\, G^{-1}(x, x')\, G(x', x'') = \delta^{(D)}(x - x''), \tag{6-114}$$

where $\delta^{(D)}$ represents the $D$-dimensional delta-function.


Looking back at the idea underlying the LG theory, the function m(x) 
defining the space-dependent order parameter corresponds to a state 
of the system under consideration near the critical point in the coarse-grained description that the theory considers as relevant. In this description the functional $H_{LG}$ of (6-102a) specifies the effective Hamiltonian as
a state function (for given values of T,h; the temperature enters into the 
effective Hamiltonian as a parameter as a result of the coarse-graining 
over the irrelevant state variables) and correspondingly, Z appears as 
the partition function, with the probabilities of the various states (each 
specified by a function m(x)) fixed in accordance with the Gibbs canon- 
ical distribution. What we did in sec. 6.3.2 in our attempt to evaluate 
Z was to adopt the saddle-point method that singled out the states de- 
scribed by constant functions m(x) = m in determining the equilibrium 
state, thereby ignoring the spatial fluctuations of the order parameter, the 
latter being expressions of correlations among the microscopic state variables. If one desires to include correlations on all length scales, one again faces a prohibitive task, but a more modest and practical aim would be to include only the relevant correlations, i.e., ones corresponding to long wavelength fluctuations of the order parameter around the homogeneous configuration.


This is done by including the term depending on $\nabla m$ in the effective Hamiltonian $H_{LG}$ and effecting a quadratic approximation, whereby the partition function reduces to a Gaussian functional integral that can be looked upon as a generalization of the discrete Gaussian integral of the form (6-111). In other words, we interpret the Gaussian-distributed discrete variables $\phi_i$ ($i = 1, 2, \ldots, N$) as state variables of a system (a fictitious one at this stage), for which $Z_N$ is the partition function and the relevant correlations are given by the second equality in (6-112).


With this background, we now consider our system of interest close to a critical point and go over to the continuum limit constituting the generalization referred to above, where the matrix with elements $(G^{-1})_{ij}$ now goes over to the operator represented by $G^{-1}(x, x')$.


In the equilibrium state one has $\mathbf{m}(x) = m\,\hat{e}$, where $\hat{e}$ is a unit vector in the $d$-dimensional spin-space singled out by broken symmetry or by an infinitesimally weak external field of strength $h \to 0$. We refer to $\hat{e}$ as the unit vector in the longitudinal direction, while there remain $d - 1$ transverse directions along unit vectors, say, $\hat{e}_r$ ($r = 1, 2, \ldots, d - 1$).

One can then introduce a set of fluctuating state variables $\phi(x)$, $\psi_r(x)$ ($r = 1, 2, \ldots, d - 1$) in terms of which $\mathbf{m}(x)$ can be expressed as

$$\mathbf{m}(x) = \left(m + \phi(x)\right)\hat{e} + \sum_{r=1}^{d-1} \psi_r(x)\,\hat{e}_r. \tag{6-115a}$$


The quadratic approximation in the LG Hamiltonian $H_{LG}$ ((6-102b), (6-102c)) is effected by making the following replacements

$$(\nabla \mathbf{m})^2 = (\nabla\phi)^2 + (\nabla\psi)^2, \qquad \mathbf{m}^2 = m^2 + 2m\phi + \phi^2 + \psi^2,$$

$$\mathbf{m}^4 = m^4 + 4m^3\phi + 6m^2\phi^2 + 2m^2\psi^2, \tag{6-115b}$$

where the scalar field $\psi(x)$ in the spin space is defined by

$$\psi^2 = \sum_{r=1}^{d-1} \psi_r^2, \tag{6-115c}$$

and where terms of degree three or higher in $\phi, \psi_r$ ($r = 1, 2, \ldots, d - 1$) have been ignored. The symbol $\nabla$ appearing in (6-115b) stands for the $D$-dimensional gradient operator. The above prescription leads to the following expression involving the quadratic approximation of the effective Hamiltonian,

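$$\beta H_{LG} = \beta V f + \int d^{(D)}x \left[\frac{K}{2}(\nabla\phi)^2 + \frac{1}{2}\left(\tau + 12um^2\right)\phi^2\right] + \int d^{(D)}x \left[\frac{K}{2}(\nabla\psi)^2 + \frac{1}{2}\left(\tau + 4um^2\right)\psi^2\right]$$

$$= \beta V f + \frac{K}{2}\int d^{(D)}x \left[(\nabla\phi)^2 + \xi_l^{-2}\phi^2\right] + \frac{K}{2}\int d^{(D)}x \left[(\nabla\psi)^2 + \xi_t^{-2}\psi^2\right], \tag{6-116a}$$

(check this out; note that there is no linear term in $\phi, \psi$ in either of the integrands on the right hand side of the first equality in virtue of the fact that $m$ minimizes the Landau free energy density $f$ of (6-105b)). In writing the second equality above we have defined

$$\xi_l^{2} = \frac{K}{\tau + 12um^2}, \qquad \xi_t^{2} = \frac{K}{\tau + 4um^2}. \tag{6-116b}$$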
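The coefficients entering (6-116b) can be traced with a short worked expansion. The sketch assumes that the non-gradient part of the integrand has the Landau form $\frac{\tau}{2}\mathbf{m}^2 + u\,\mathbf{m}^4 - h\,(m + \phi)$ with the field $h$ along the longitudinal direction; this form is inferred from the coefficients in (6-115b) and (6-116b) and should be checked against (6-105b), including its overall normalization.

```latex
% Assumed Landau form (to be checked against (6-105b)):
%   f(m) = (\tau/2) m^2 + u m^4 - h m
% Insert the replacements (6-115b) and collect powers of \phi and \psi:
\begin{align*}
\frac{\tau}{2}\,\mathbf{m}^2 + u\,\mathbf{m}^4 - h\,(m + \phi)
  &= \underbrace{\frac{\tau}{2}m^2 + u m^4 - h m}_{f(m)}
   + \underbrace{\left(\tau m + 4 u m^3 - h\right)}_{f'(m)\,=\,0}\,\phi \\
  &\quad + \frac{1}{2}\left(\tau + 12 u m^2\right)\phi^2
   + \frac{1}{2}\left(\tau + 4 u m^2\right)\psi^2 + \cdots
\end{align*}
% The linear term vanishes because m minimises f. Writing the quadratic terms
% as (K/2)\,\xi_l^{-2}\phi^2 and (K/2)\,\xi_t^{-2}\psi^2 gives
%   \xi_l^{-2} = (\tau + 12 u m^2)/K, \qquad \xi_t^{-2} = (\tau + 4 u m^2)/K .
```

The longitudinal coefficient is simply $f''(m)$, while the transverse one is $f'(m)/m$ evaluated at $h = 0$, which is why the latter vanishes identically in the ordered phase.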
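Let us now consider a Gaussian functional integral of the form

$$\mathcal{Z} = \int \mathcal{D}\phi(x)\, e^{-I[\phi]}, \tag{6-117}$$

which can be interpreted as a partition function defined in terms of an order parameter $\phi(x)$ whose probability distribution is proportional to $e^{-I[\phi]}$, the functional $I[\phi]$ being of the form

$$I[\phi] = \frac{K}{2}\int d^{(D)}x \left[(\nabla\phi)^2 + \xi^{-2}\phi^2\right]. \tag{6-117b}$$

This can be expressed in the form

$$I[\phi] = \frac{1}{2}\int d^{(D)}x\, d^{(D)}x'\, \phi(x)\, G^{-1}(x, x')\, \phi(x'), \tag{6-117c}$$

with the operator kernel $G^{-1}(x, x')$ given by

$$G^{-1}(x, x') = K\,\delta^{(D)}(x - x')\left(-\nabla^2 + \xi^{-2}\right), \tag{6-117d}$$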


(check this out by invoking integration by parts; $\delta^{(D)}$ stands for the $D$-dimensional delta function). Looking at (6-114), one then finds that $G(x, x')$, which depends on $x, x'$ only through the difference $x - x'$, gets defined as

$$K\left(-\nabla^2 + \xi^{-2}\right)G(x) = \delta^{(D)}(x). \tag{6-117e}$$

$G(x)$ thus appears as the Green's function for the operator $K\left(-\nabla^2 + \xi^{-2}\right)$.


It is often convenient to make a Fourier transformation from the 'real space' co-ordinate $x$ to the new co-ordinate $q$, such that the Fourier transform of the function $\phi(x)$ is given by

$$\phi(q) = \int d^{(D)}x\, e^{-iq\cdot x}\, \phi(x), \tag{6-118a}$$

for which the inverse transformation reads

$$\phi(x) = \frac{1}{(2\pi)^D}\int d^{(D)}q\, e^{iq\cdot x}\, \phi(q). \tag{6-118b}$$

Here we have adopted the convention of denoting a function of the transformed variable $q$ by the same symbol as that used to denote the corresponding function of the real space variable $x$, though the functional dependence differs in the two cases; $q$ is commonly referred to as the co-ordinate in the Fourier space, or the reciprocal space.


One obtains the correlation $\langle \phi(x)\phi(x') \rangle$, by generalization of (6-112) to the continuum case, as

$$\langle \phi(x)\phi(x') \rangle = G(x - x'), \tag{6-119}$$

(note that $\langle \phi(x) \rangle = 0$, since there is no linear term in the integrand on the right hand side of (6-117b)). Looking back to (6-116a), one can now apply the above result, with $\beta H[\phi, \psi]$ replacing $I[\phi]$, to obtain the correlations for the longitudinal and transverse fluctuations around the equilibrium magnetization $m$. It is convenient to express the correlations in terms of the reciprocal space variable $q$, and write our final results as

$$\langle \phi(q)\phi(q') \rangle = (2\pi)^D \delta^{(D)}(q + q')\, G_l(q), \qquad \langle \psi(q)\psi(q') \rangle = (2\pi)^D \delta^{(D)}(q + q')\, G_t(q), \tag{6-120a}$$


where the longitudinal and transverse Green's functions are given by (refer to (6-117d))

$$G_l^{-1}(q) = K\left(q^2 + \xi_l^{-2}\right), \qquad G_t^{-1}(q) = K\left(q^2 + \xi_t^{-2}\right). \tag{6-120b}$$


1. The two-point correlations can be determined experimentally from scattering data obtained with appropriate probe beams.

2. The correlations given by (6-120a), (6-120b) are said to be of the Lorentzian form.
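How a Lorentzian in reciprocal space turns into an exponentially decaying correlation in real space can be illustrated in the simplest setting, $D = 1$, where the continuum Green's function of $K(-\nabla^2 + \xi^{-2})$ is $G(x) = \frac{\xi}{2K}e^{-|x|/\xi}$. The sketch below sums the Fourier series of (6-120b) on a periodic one-dimensional lattice; the lattice spacing `a`, the number of sites `N` and the values of `K` and `xi` are hypothetical, and the lattice form of $q^2$ is a discretization choice made here, not part of the text.

```python
import math

K, xi = 1.0, 2.0       # hypothetical stiffness and correlation length
a, N = 0.05, 4000      # hypothetical lattice spacing and number of sites

def green_real_space(n_sites):
    """G at separation x = n_sites * a from the lattice Fourier sum.

    G(x) = (1/L) * sum_q cos(q x) / (K (q_lat^2 + xi^-2)), with L = N a and
    q_lat^2 = (2 - 2 cos(q a)) / a^2 the lattice version of q^2.
    """
    length = N * a
    total = 0.0
    for k in range(N):
        q = 2.0 * math.pi * k / length
        q2 = (2.0 - 2.0 * math.cos(q * a)) / a**2
        total += math.cos(q * n_sites * a) / (K * (q2 + xi**-2))
    return total / length

g0 = green_real_space(0)
g_xi = green_real_space(round(xi / a))        # separation x = xi
g_2xi = green_real_space(round(2 * xi / a))   # separation x = 2 xi

print(g0, xi / (2.0 * K))
print(g_xi / g0, g_2xi / g_xi, math.exp(-1.0))
```

Each increase of the separation by one correlation length reduces the correlation by a factor close to $e^{-1}$, up to small lattice corrections.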


The critical exponents for the longitudinal and transverse correlation lengths can now be obtained from (6-116b), making use of the result (6-107) (recall that $m = 0$ for $\tau > 0$ in zero field). Thus,

$$\xi_l^{-2} \sim \tau \ \ (\tau > 0), \qquad \xi_l^{-2} \sim -2\tau \ \ (\tau < 0),$$

$$\xi_t^{-2} \sim \tau \ \ (\tau > 0), \qquad \xi_t^{-2} = 0 \ \ (\tau < 0). \tag{6-121}$$

The correlation lengths diverge for $\tau \to 0$, telling us that, close to the critical point, long wavelength fluctuations are indeed dominant. The result $\xi_t^{-1} = 0$ for $\tau < 0$ relates to the existence of transverse Goldstone modes below the critical transition. Such modes appear in the case of broken symmetry under a group of continuous transformations; in the case of a spin system the group is made up of rotations in the $(d - 1)$-dimensional space spanned by the transverse unit vectors $\hat{e}_r$ ($r = 1, 2, \ldots, d - 1$) of (6-115a).


Expressing the divergence of the longitudinal correlation length for $\tau \to 0$ as $\xi_l \sim |\tau|^{-\nu_\pm}$ (where the sub-indices '+', '−' refer to $T > T_c$, $T < T_c$ respectively), one obtains $\nu_\pm = \frac{1}{2}$. The exponent for the transverse correlation length is $\frac{1}{2}$ for $T > T_c$, while $\xi_t^{-1} = 0$ for all $T < T_c$.

The symbol $\nu$ for the critical exponent introduced above is not to be confused with the same symbol used to denote the number of nearest neighbors in a lattice model.
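The values in (6-121) follow in two lines once the equilibrium magnetization is inserted into (6-116b). The sketch assumes that (6-107), which is not reproduced in this section, gives $m^2 = -\tau/(4u)$ for $\tau < 0$ in zero field; this is the value consistent with the factor $-2\tau$ in (6-121).

```latex
% Assumption: (6-107) reads m^2 = -\tau/(4u) for \tau < 0, and m = 0 for \tau > 0.
\begin{align*}
\tau > 0:&\quad \xi_l^{-2} = \xi_t^{-2} = \frac{\tau}{K}
   \;\Rightarrow\; \xi_{l,t} = \left(\frac{K}{\tau}\right)^{1/2}, \\
\tau < 0:&\quad \xi_l^{-2} = \frac{\tau + 12u\left(-\tau/4u\right)}{K} = \frac{-2\tau}{K}
   \;\Rightarrow\; \xi_l = \left(\frac{K}{2|\tau|}\right)^{1/2}, \\
         &\quad \xi_t^{-2} = \frac{\tau + 4u\left(-\tau/4u\right)}{K} = 0 .
\end{align*}
% In both phases \xi_l \propto |\tau|^{-1/2}, i.e. \nu_+ = \nu_- = 1/2.
```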


While (6-120a), (6-120b) give the correlations in terms of Fourier-transformed variables, one can transform back to obtain the correlations in real space. Close to the critical point the longitudinal and transverse correlations $C(x - x')$ (i.e., $\langle \phi(x)\phi(x') \rangle$ or $\langle \psi(x)\psi(x') \rangle$ as the case may be; the two are independent of each other, i.e., there is no cross-correlation) vary with $|x - x'|$ in the near and far zones ($|x - x'| \ll \xi$ and $|x - x'| \gg \xi$, where $\xi$ is the relevant correlation length) as follows ([128])


(6-122) 


The exponential decay in the far zone explains why $\xi$ ($\xi_l$ or $\xi_t$ as the case may be) is referred to as the correlation length. Precisely at $T = T_c$, when the correlation length becomes infinitely large, the above formula tells us that the correlations decay with spatial separation as $|x - x'|^{-(D-2)}$.


In summary, the LG theory includes the mean field theory as the leading 
order of approximation and, at the same time, yields approximate expres- 
sions for the two-point correlation functions relating to the longitudinal 


and transverse fluctuations of the order parameter. 


As for the critical exponents $(\alpha, \beta, \gamma, \delta, \nu)$ derived in the above paragraphs, the LG theory points to a remarkable universality across thermodynamic systems of various descriptions undergoing continuous phase transitions, close to their respective critical points since, as we have seen, the exponents do not depend on the spatial dimension $D$ or the dimension $d$ defining the order parameter. The LG theory is indeed applicable to a great variety of systems but, on the flip side, the values of the critical exponents are found to differ from those measured experimentally for various systems of interest. The measured exponents also exhibit a universality, but one of a kind different from that implied by the LG theory, since experimental values are found to fall in a number of universality classes defined jointly by the dimensions $D$ and $d$. An instance of real-life systems for which the critical exponents predicted by the LG theory agree with experimentally measured values is provided by superconductors (see sec. 7.6.6 later in this book).


To recapitulate, the formulation of the LG theory for any given system close to its critical point is made up of the following steps: (a) setting up an appropriate spatially varying order parameter with values of $D$ and $d$ determined by the nature of the system; (b) setting up the effective Hamiltonian as a functional of the order parameter; (c) making use of the saddle-point approximation to obtain the equilibrium value of the spatially homogeneous order parameter (given that the system under consideration is a spatially homogeneous one); (d) working out the Gaussian corrections to the saddle-point approximation and obtaining the relevant correlations pertaining to the fluctuations of the order parameter.


While the equilibrium order parameter obtained in the saddle-point approximation relates to the thermodynamic parameters of the system, the Gaussian corrections provide us with important information regarding the structure of the fluctuations and the response of the system to probe fields. In the present context, the probe field $h$ has been chosen to be time-independent (and spatially homogeneous, though inhomogeneous probe fields can also be accommodated), so that it induces a constant shift in the equilibrium state, from which one obtains only the static response functions, while the dynamic response can be obtained by including time-varying probe fields. Generally speaking, the response functions are related to the correlations (see section 8.4 in chapter 8). In the static case considered in the above paragraphs, the susceptibilities $\chi_l, \chi_t$ constitute the response functions, related to the correlation functions (see [128], chapters 1, 2) as

$$\chi_l = \beta \int d^{(D)}x\, \langle \phi(x)\phi(0) \rangle, \qquad \chi_t = \beta \int d^{(D)}x\, \langle \psi(x)\psi(0) \rangle, \tag{6-123}$$


where the sub-indices 'l', 't' refer to the longitudinal and transverse response respectively. The longitudinal response is defined in terms of the equilibrium magnetization induced in the direction of the field by the formula $\chi_l = \frac{\partial m}{\partial h}$, while the transverse response is defined in terms of the fluctuations around the equilibrium value and is obtained from the second formula in (6-123) on working out the Gaussian corrections to the saddle-point approximation.


Another response function of interest is the specific heat which, once again, depends sensitively on the correlations. As we saw in (6-110c), the specific heat, worked out from the saddle-point approximation to the free energy, is characterized by a discontinuity across the transition. This discontinuity, however, is crucially dependent on the correlations. On including the Gaussian corrections, the free energy, whose saddle-point approximation given in (6-110b) is denoted below by $\bar{F}$ (recall that the free energy has an analytic part that has not been included in our considerations), works out to ([128])


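$$F = \bar{F} + \frac{V}{2\beta}\int \frac{d^{(D)}q}{(2\pi)^D}\left[\ln\left(K(q^2 + \xi_l^{-2})\right) + (d - 1)\ln\left(K(q^2 + \xi_t^{-2})\right)\right]. \tag{6-124}$$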


One can work out from this the discontinuity in the specific heat as was done in (6-110c) in the saddle-point approximation, but now an anomaly shows up in the mean field theory. It turns out that, when the Gaussian correction is included, the discontinuity in the specific heat becomes dependent on the dimension $D$ of the ambient space: for $D > 4$ the discontinuity is finite as in (6-110c), while there appears a divergence in the discontinuity for $D < 4$. Indeed, as one includes corrections of higher orders beyond the Gaussian approximation, there appears further modification, which tells us that the mean field results are sensitive to fluctuations for $D < 4$. One refers to $D = 4$ as the upper critical dimension (refer back to sections 6.2.3.4 and 6.2.3.5) since, for $D < 4$, the mean field theory gives fundamentally wrong results while, for $D > 4$, mean field results are immune to corrections resulting from the correlations.
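The special role of $D = 4$ can be made plausible with a rough numerical sketch. It assumes (this step is not spelled out above) that differentiating the logarithms of the Gaussian correction twice with respect to $\tau$ produces, up to constant factors, the radial integral $I_D(\xi) = \int_0^{\Lambda} dq\, q^{D-1}/(q^2 + \xi^{-2})^2$, where $\Lambda$ is a hypothetical short-distance cutoff. For $D < 4$ this integral grows as $\xi^{4-D}$ when $\xi \to \infty$, while for $D > 4$ it saturates at a cutoff-dependent constant.

```python
def fluct_integral(D, xi, cutoff=1.0, n=200_000):
    """Midpoint-rule estimate of int_0^cutoff q^(D-1) / (q^2 + xi^-2)^2 dq.

    The integral and the cutoff are assumptions of this sketch; constant
    prefactors (angular factors, powers of K) are dropped.
    """
    kappa2 = xi ** -2
    step = cutoff / n
    total = 0.0
    for i in range(n):
        q = (i + 0.5) * step
        total += q ** (D - 1) / (q * q + kappa2) ** 2
    return total * step

# Effect of doubling the correlation length below and above four dimensions.
ratios = {D: fluct_integral(D, 200.0) / fluct_integral(D, 100.0) for D in (3, 5)}
for D, ratio in ratios.items():
    print(D, ratio)
```

In this sketch, doubling $\xi$ roughly doubles the integral for $D = 3$ (consistent with $\xi^{4-D}$) and leaves it almost unchanged for $D = 5$, mirroring the divergent and finite behavior described above.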


The relevance of the fluctuations for D < 4 is seen on comparing the 
Gaussian-corrected discontinuity (Ac) in the specific heat with the dis- 
continuity (Ac) obtained from (6-110b), and is expressed in the form of 
an inequality (referred to as the Ginzburg criterion) that tells us that the 


role of fluctuations can be ignored only if ([67], [114])


$$\left(\frac{|\tau|}{\tau_G}\right)^{\frac{4-D}{2}} \gg 1, \tag{6-125}$$


where $\tau_G$, termed the Ginzburg reduced temperature, is a positive constant characteristic of the system under consideration. This condition cannot be satisfied close to the critical point unless $D > 4$.


The renormalization group (RG) theory, outlined in sec. 6.4 below, supplements and improves upon the results obtained from the LG theory.


6.4 Phase transitions and critical phenomena: the renormalization group approach


The content of this section is based principally on [140], [99], [114], and 


[128]. 


6.4.1 Critical phenomena: outline 


Critical phenomena and critical exponents have already been referred to 
in sections 6.2.3.4, 6.2.4, 6.3.3 and 6.3.4, and will feature in several 
sections later in this book. We summarize here a number of empiri- 
cal features relating to critical phenomena and critical exponents before 
outlining the renormalization group approach to critical phenomena in 


sec. 6.4.3 below. 


The most well-known instance of a continuous phase transition and the 
associated critical point relates to the gas-liquid transition (say, from wa- 
ter vapor to liquid water), where the critical point can be identified on the 
critical isotherm as the one marked A’B’C’ in fig. 4-6 (in the present con- 
text, one is to imagine that the looped portion BCPDE of the lowest van 
der Waals isotherm is replaced with the straight portion (dotted) BPE), the
critical point B’ being the point of inflexion on this isotherm. Close to the 
critical point, the thermodynamic parameters typically exhibit power law 
behavior in |T — T,| with characteristic exponents indicating a divergence 


in some of the parameters while some others approach zero value. 


In the figure mentioned above, the variable plotted along the abscissa is the volume of a specified quantity of the material under consideration; one commonly considers a unit mass of the material, in which case the symbol $V$ is to be replaced with $v$, the specific volume (at times, the term specific volume refers to one mole of the material); the density $\rho$ is the inverse of the specific volume.


The power law behavior of a number of relevant thermodynamic variables relating to the gas-liquid phase transition, defining the corresponding critical exponents ($\alpha, \beta, \gamma, \delta$; the exponent $\beta$ is not to be confused with the inverse temperature $(k_B T)^{-1}$), is listed below

(specific heat at constant volume:) $c \sim |T - T_c|^{-\alpha}$
(density difference between phases:) $\rho_L - \rho_G \sim |T - T_c|^{\beta}$
(isothermal compressibility:) $\kappa_T \sim |T - T_c|^{-\gamma}$
(pressure variation on critical isotherm:) $p - p_c \sim |\rho - \rho_c|^{\delta}\, \mathrm{sign}(\rho - \rho_c)$. (6-126a)


Of the thermodynamic parameters listed above, the density difference $\rho_L - \rho_G$ possesses the significance of being the order parameter characterizing the transition because it distinguishes between the two phases involved, while $c$ and $\kappa_T$ are in the nature of response functions that can be expressed in terms of correlations within the system (refer to section 4.5.3 where thermodynamic parameters of a liquid are related to the pair correlation function).

Generally speaking, the order parameter is defined in terms of the thermodynamic expectation value of some microscopically defined function that is not invariant under a symmetry operation under which the microscopic Hamiltonian remains invariant. There is no such obvious broken symmetry in the case of the gas-liquid transition, though the thermodynamic variable $\rho_L - \rho_G$ still works as the relevant order parameter.


Also widely referred to are magnetic systems described by lattice models 
that can be described and investigated in relatively simple terms (though 
exact results are few and far between). While these models involve a 
number of essential simplifications, they can be mapped to numerous 
non-magnetic real-life systems such as binary alloys. What is more, 
their discrete nature poses no problem in the context of critical phenom- 
ena since the latter depend essentially on long-range correlations and 
are independent of the small-scale structure of the model to a large extent. Of particular interest is the Ising model in 2 and 3 dimensions ($D = 2, 3$; dimensions $D$ higher than 3 are also of great theoretical relevance, while $D = 1$ does not show phase transitions for short-range interactions among sites). One once again observes critical behavior as outlined in sections 6.2.4 and 6.3 where now the relevant thermodynamic parameters, with their associated critical exponents $(\alpha, \beta, \gamma, \delta)$, are as follows

(specific heat at constant magnetization:) $c \sim |T - T_c|^{-\alpha}$
(magnetization density:) $m \sim |T - T_c|^{\beta}$
(isothermal susceptibility:) $\chi \sim |T - T_c|^{-\gamma}$
(field variation on critical isotherm:) $h \sim |m|^{\delta}\, \mathrm{sign}(m)$. (6-126b)


Interestingly, there exists a correspondence between the thermodynamic parameters for the systems described in (6-126a), (6-126b) in spite of the great difference in the physical nature of the two, where the two specific heats correspond to each other and, in addition, $(\rho_L - \rho_G) \leftrightarrow m$, $\kappa_T \leftrightarrow \chi$, $p \leftrightarrow h$, with the magnetization playing the role of the order parameter in the paramagnetic-to-ferromagnetic transition in the case of the Ising model.


In outlining the Landau-Ginzburg theory, we have referred to the free energy function $F$ defined in terms of the energy functional $H_{LG}$. One could alternatively base one's considerations on the Gibbs potential $G$ which, for a magnetic system, is obtained from $F$ as

$$G = F - \mathbf{M}\cdot\mathbf{h}, \tag{6-127}$$

where $\mathbf{M}$ is the total magnetization, obtained by integrating the magnetization density over the volume of the system (recall that the theory is applicable to systems of diverse descriptions by appropriately interpreting the thermodynamic parameters expressed by means of the various symbols). Formula (6-127) constitutes a Legendre transformation from the variables $T, M$ to $T, h$, where, for a given direction of magnetization, one may use scalars $m, h$ in the place of vectors.


The critical exponents $(\alpha, \beta, \gamma, \delta)$ in the liquid-gas transition are found to be the same as the corresponding exponents in the $D = 3$ Ising model, which is indicative of the existence of universality classes among systems of various descriptions.


While the critical behavior can be explained qualitatively within the framework of a mean field theory, the critical exponents obtained from that theory do not conform to results of experimental observations and numerical simulations. What is more, the mean field theory does not yield the scaling of the correlation functions (the pair correlation in the case of a liquid-gas system and the spin correlation for the Ising system) near the critical point. As we saw in sec. 6.3.4, one can obtain the value of the critical exponent $\nu$ relating to this scaling within the framework of the Landau-Ginzburg theory though, once again, this value may differ from the actual one. The scaling law of the correlation length ($\xi$) is of the form

$$\xi \sim |T - T_c|^{-\nu}, \tag{6-128a}$$

and the correlation length itself is defined in terms of the spatial decay of the two-point correlation at $T = T_c$,

$$C(r) \sim \frac{1}{r^{D-2+\eta}}, \tag{6-128b}$$

where $r$ stands for the separation between the points involved in the correlation ($C(r) = \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle$ for the Ising system, where $s_i, s_j$ are the spins at sites labeled $i, j$ at the separation $r$) and $\eta$ is one other critical exponent of interest.


The renormalization group (RG) approach puts together a successful theory explaining the critical behavior of systems undergoing continuous phase transitions, providing us with a framework for the correct classification into universality classes (recall that the mean field theory bunches all systems into one single universality class regardless of the dimensions $D, d$) and for the computation of the correct critical exponents characterizing the various classes. Additionally, it proposes a number of relations between the critical exponents on the basis of a scaling hypothesis.


6.4.2 Scaling 


The scaling hypothesis, put forward by Widom, is based on the idea that the correlation length ($\xi$) provides the only relevant length scale near the critical point, and on the related assumption that the singular part of the free energy close to the critical point (where one disregards a part that depends analytically on the arguments $\tau, m$) is described by a function of the form (refer to [67], [128])

$$F(\tau, h) = |\tau|^{2-\alpha}\, \Phi\!\left(\frac{h}{|\tau|^{\Delta}}\right). \tag{6-129}$$


In this formula, the field variable $h$ on either side is to be replaced with the appropriate functional dependence on $\tau, m$, since these are the natural variables for the free energy (recall that it is the Gibbs free energy that has $\tau, h$ as the natural variables and the property expressed in the above formula applies to $G$ as well, without the necessity of replacing $\tau, h$ with $\tau, m$). Here we have used scalar variables $m, h$ for the sake of simplicity; in the case of $d$-dimensional vectors, one has to distinguish between their longitudinal and transverse components. In the above expression, $\alpha, \Delta$ are scaling constants that determine the critical exponents for the other thermodynamic variables. In other words, only two among all the critical exponents introduced earlier $(\alpha, \beta, \gamma, \delta, \nu, \eta)$ are independent ones, the others being related to these two and to one another by scaling relations, a number of which will be stated below.

Generally speaking, the functional dependence of $F$ may differ on the two sides of the transition, i.e., for $\tau < 0$ and $\tau > 0$, though the constants $\alpha, \Delta$ do not differ. This possible dependence of the functional form will be left implied in the following.


A homogeneous function $g$ of degree $r$ of a single real variable is one satisfying the requirement

$$g(\lambda x) = \lambda^{r} g(x), \tag{6-130a}$$

for all real numbers $\lambda$ and for all $x$ in the domain of definition of $g$. This can be generalized to homogeneous functions of more than one real variable; thus, the function $g(x, y)$ of two real variables satisfying

$$g(\lambda x, \lambda y) = \lambda^{r} g(x, y), \tag{6-130b}$$

for some constant $r$ and for all real $\lambda$, is homogeneous of degree $r$.

Further, a generalized homogeneous function $g$ of, say, two variables, is defined by the requirement that

$$g(\lambda^{a} x, \lambda^{b} y) = \lambda\, g(x, y), \tag{6-130c}$$


for two constants $a, b$ and for all $\lambda$, which implies that (6-130b) is a special case corresponding to $a = b$, for which the degree is $r = \frac{1}{a}$.

It is straightforward to see that the partial derivatives (of any order) of a generalized homogeneous function of two variables are themselves of the generalized homogeneous type. Further, a generalized homogeneous function $g$ of two variables can be expressed in the form

$$g(x, y) = x^{\lambda}\, \mathcal{G}\!\left(\frac{y}{x^{\Delta}}\right), \tag{6-131}$$

for some appropriate $\lambda, \Delta$; and conversely, a function of the form (6-131) is a generalized homogeneous function (check these statements out). Hence the singular part of the free energy of (6-129) is a generalized homogeneous function and further, the singular parts of thermodynamic functions that can be obtained from it by partial differentiation of various orders are all of the same type, each with its associated scaling constants $\lambda, \Delta$.
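A concrete instance of a generalized homogeneous function in the sense of (6-130c) is provided by the Landau (mean field) free energy itself. The sketch below assumes the form $F(\tau, h) = \min_m \left[\frac{\tau}{2}m^2 + u m^4 - hm\right]$, inferred from the formulae of sec. 6.3 rather than quoted from the text, and checks numerically that $F(\lambda^{1/2}\tau, \lambda^{3/4}h) = \lambda F(\tau, h)$, i.e., $a = \frac{1}{2}$ and $b = \frac{3}{4}$. In the notation of (6-129) this corresponds to $2 - \alpha = 1/a = 2$ and $\Delta = b/a = \frac{3}{2}$. The value of `U` and the sample points are arbitrary illustrative choices.

```python
import math

U = 1.0  # hypothetical quartic coefficient u

def landau_F(tau, h):
    """min over m of tau/2 m^2 + U m^4 - h m, for h > 0 (minimum at m > 0).

    Assumed Landau form; the stationarity condition is solved by bisection.
    """
    def df(m):
        return tau * m + 4.0 * U * m**3 - h

    lo = 0.0                                   # df(0) = -h < 0
    hi = 1.0 + math.sqrt(abs(tau) / U) + (h / U) ** (1.0 / 3.0)  # df(hi) > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if df(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)
    return 0.5 * tau * m * m + U * m**4 - h * m

a_exp, b_exp = 0.5, 0.75   # scaling powers a, b of (6-130c) for this example

max_rel_err = 0.0
for tau in (-0.3, 0.2):
    for h in (0.05, 0.4):
        for lam in (0.5, 3.0):
            lhs = landau_F(lam**a_exp * tau, lam**b_exp * h)
            rhs = lam * landau_F(tau, h)
            max_rel_err = max(max_rel_err, abs(lhs - rhs) / abs(rhs))

print(max_rel_err)
```

The scaling works because the substitution $m \to \lambda^{1/4} m$ multiplies every term of the bracket by $\lambda$, so the minimum value scales by the same factor.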


In the following, the symbol '∼' will be used to denote the singular part of a variable (occurring on the left side of the symbol) as $\tau \to 0$ (at times the independent variable may refer to the field $h$ or to some other thermodynamic parameter) or the most singular part, as the case may be. Or else, in the case of an analytic function, the symbol may be used to denote the leading approximation for $\tau \to 0$. Numerical pre-factors occurring to the right of the symbol will often be suppressed. All this does not adversely affect the identification of the critical exponents and the relations among the latter.


Starting from the formula (6-129) for the free energy $F$ one can work out how the internal energy $U$ behaves near the critical point by making use of the relation $U = \frac{\partial(\beta F)}{\partial \beta}$, where $\beta = (k_B T)^{-1}$ and $T$ is related to $\tau$ by (6-109). One obtains for $h = 0$

$$U \sim \tau^{1-\alpha}, \tag{6-132a}$$

(check this out) and, by a further differentiation with respect to $T$,

$$c \sim \tau^{-\alpha}. \tag{6-132b}$$


This tells us that the parameter $\alpha$ appearing in (6-129) is the same as the critical exponent $\alpha$ introduced earlier (refer back to formulae (6-126a), (6-126b)). With one of the two scaling parameters $\alpha, \Delta$ of eq. (6-129) fixed as one of the critical exponents, the other parameter $\Delta$, referred to as the 'gap exponent', can be related to the remaining critical exponents by means of the standard thermodynamic formulae. In other words, the identities involving the critical exponents are nothing but a re-statement of the thermodynamic relations.

One observes that, by Widom's scaling hypothesis, the same gap exponent $\Delta$ occurs in the scaling forms of all the thermodynamic parameters of the system under consideration.


Thus, going over from $F$ to $G$ by a Legendre transformation, the magnetic moment is obtained as

$$m = -\left.\frac{\partial G}{\partial h}\right|_{h=0}, \tag{6-133a}$$

from which one finds that

$$\beta = 2 - \alpha - \Delta, \tag{6-133b}$$


(recall that the critical exponent $\beta$ is to be distinguished from the inverse
temperature $(k_B T)^{-1}$). On the other hand, considering the limit $h/\tau^{\Delta} \to
\infty$ ($h \neq 0$, $\tau \to 0$), one obtains the relation between $m$ and $h$ on the critical
isotherm ($\tau = 0$), from which follows the relation

$$\delta = \frac{\Delta}{\beta}. \tag{6-133c}$$
Finally, one obtains the susceptibility from the formula $\chi = \left.\frac{\partial m}{\partial h}\right|_{h=0}$, thereby
arriving at the relation

$$\gamma = 2\Delta - 2 + a. \tag{6-133d}$$
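The relations (6-133b) to (6-133d) can be traced as follows. This is a sketch that assumes (6-129) has the Widom form $F_{\rm sing} \sim \tau^{2-a} f(h/\tau^{\Delta})$; the symbols $f$ and $x = h/\tau^{\Delta}$ are labels chosen here for illustration, and signs and numerical factors are suppressed.

```latex
\begin{align*}
m &\sim \frac{\partial}{\partial h}\Big[\tau^{2-a} f(x)\Big]
   = \tau^{2-a-\Delta} f'(x), \quad x = \frac{h}{\tau^{\Delta}}
   &&\Rightarrow\ \beta = 2 - a - \Delta \quad (h \to 0), \\
m &\sim \tau^{\beta} f'(x), \quad f'(x) \sim x^{\beta/\Delta}\ \ (x \to \infty)
   &&\Rightarrow\ m \sim h^{\beta/\Delta} \equiv h^{1/\delta}, \quad \delta = \frac{\Delta}{\beta}, \\
\chi &\sim \left.\frac{\partial m}{\partial h}\right|_{h=0}
   \sim \tau^{2-a-2\Delta} f''(0)
   &&\Rightarrow\ \gamma = 2\Delta - 2 + a .
\end{align*}
```

In the second line the power $x^{\beta/\Delta}$ is the one for which the dependence on $\tau$ cancels, as it must on the critical isotherm.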


Having expressed the critical exponents $\alpha$, $\beta$, $\gamma$, $\delta$ in terms of the two scaling
parameters $a$, $\Delta$ of eq. (6-129), one obtains the following identities
satisfied by these exponents:

$$\text{(Rushbrooke's identity:)}\quad \alpha + 2\beta + \gamma = 2,$$

$$\text{(Widom's identity:)}\quad \delta - 1 = \frac{\gamma}{\beta}. \tag{6-134}$$
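A quick numerical check of these identities can be sketched as below. The function names are illustrative, and the two sample inputs (the mean-field values $a = 0$, $\Delta = 3/2$ and the exactly known 2D Ising values $a = 0$, $\Delta = 15/8$) are standard values quoted here as assumptions rather than taken from this section.

```python
from fractions import Fraction as Fr

def exponents_from_scaling(a, gap):
    """Return (alpha, beta, gamma, delta) from the scaling parameters a and Delta."""
    alpha = a                    # from (6-132b)
    beta = 2 - a - gap           # (6-133b)
    delta = gap / beta           # (6-133c)
    gamma = 2 * gap - 2 + a      # (6-133d)
    return alpha, beta, gamma, delta

def satisfies_identities(alpha, beta, gamma, delta):
    """True if both identities in (6-134) hold exactly."""
    rushbrooke = (alpha + 2 * beta + gamma == 2)
    widom = (delta - 1 == gamma / beta)
    return rushbrooke and widom

# Assumed sample inputs (not from this section).
mean_field = exponents_from_scaling(Fr(0), Fr(3, 2))
ising_2d = exponents_from_scaling(Fr(0), Fr(15, 8))

print(mean_field, satisfies_identities(*mean_field))
print(ising_2d, satisfies_identities(*ising_2d))
```

Exact rational arithmetic is used so that the equalities are tested without rounding error. Since both identities follow algebraically from (6-133b) to (6-133d), the check succeeds for any choice of $a$ and $\Delta$ (with $\beta \neq 0$).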


The critical exponents can be worked out in approximate theories of critical
phenomena (notable among these being the Landau-Ginzburg theory)
and in numerical investigations on various model systems, while experimentally
determined values for real-life systems are also available. Finally,
the critical exponents for various possible universality classes can
be obtained from the renormalization group theory which is an exact theory
that allows us to compute the critical exponents numerically with
quite high accuracy. The critical exponents obtained from these diverse
sources are all found to satisfy the identities stated above.


It remains to relate the exponents $\nu$, $\eta$ characterizing the correlation length
$\xi$ and the two-point correlation function $C(r)$ to the other critical exponents.
Here one makes use of the observation that, close to the critical
point, the correlation length $\xi$ is the sole length scale determining all the
thermodynamic parameters of the system. Additionally, one can invoke
the relation between the static response function ($\chi$) and the fluctuations
of the order parameter expressed by the two-point correlation function
(as expressed in the static form of the fluctuation-dissipation theorem,
see [67], sec. 7.4, [99], sec. 7.4; static response has been considered in
sec. 8.4.7.3 in the general context of dynamic response theory). These
two physical principles are consistent with the following scaling forms of
$\xi$ and $C(r)$:


$$\xi \sim |\tau|^{-\nu}, \qquad C(r) \sim r^{-(D-2+\eta)}\, g\!\left(\frac{r}{\xi}\right), \tag{6-135}$$


and imply the relations 


$$\gamma = (2 - \eta)\nu, \qquad 2 - \alpha = D\nu, \tag{6-136}$$


where the second relation is referred to as Josephson’s identity. Relations
among the critical exponents involving the dimension $D$ are termed
hyperscaling formulae, an instance of which is provided by the Josephson
identity mentioned above. The latter, however, breaks down for $D > 4$,
i.e., beyond the upper critical dimension.
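The role of the dimension $D$ in the hyperscaling relation can be illustrated with a small check. The exponent values below (the exact 2D Ising exponents and the mean-field exponents) are standard values assumed here for illustration; they are not quoted in this section, and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def first_relation_holds(gamma, eta, nu):
    """First relation in (6-136): gamma = (2 - eta) * nu."""
    return gamma == (2 - eta) * nu

def josephson_holds(alpha, nu, D):
    """Second relation in (6-136): 2 - alpha = D * nu."""
    return 2 - alpha == D * nu

# Assumed exponent sets (standard values, not taken from this section).
ising_2d = dict(alpha=Fr(0), gamma=Fr(7, 4), nu=Fr(1), eta=Fr(1, 4))
mean_field = dict(alpha=Fr(0), gamma=Fr(1), nu=Fr(1, 2), eta=Fr(0))

# Dimensions in which the mean-field exponents satisfy Josephson's identity.
dims_ok = [D for D in range(1, 8)
           if josephson_holds(mean_field["alpha"], mean_field["nu"], D)]
print(dims_ok)
```

With the mean-field exponents the identity is met only at $D = 4$, which is one way of seeing why it cannot continue to hold above the upper critical dimension, where the exponents retain their mean-field values.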


Details regarding the derivation of the scaling formulae given above are to 
be found in [67], [114], [128], [99]. [100], chapter 5, includes a wealth of 


details. 


The scaling formulae mentioned above are all consistent with the prin- 
ciple of dilation symmetry ([128]) according to which a system becomes 
self-similar (in a statistical sense) under a scale transformation at the 
critical point. This constitutes the basis of the renormalization group 


approach to critical phenomena. 


6.4.3 The renormalization group transformation 


The renormalization group transformation constitutes a method for the 
identification of the critical temperature and calculation of the critical 
exponents for systems of diverse types. The basic principles can be stated 
with reference to a lattice-based spin system, though the approach works 
for systems of various other types since the characteristics relating to the 
critical transition do not depend on the constitution specific to a system, 


being determined only by the universality class the system belongs to. 


We consider a system made up of spins located at the sites of a lattice in
$D$ dimensions, a spin being a hypothetical 2-state system specified by a
state variable that can assume either of the two values $s = \pm 1$. We label
the lattice sites with the index $i$, so that the spin at site $i$ in any given spin
configuration, represented by the symbol $\{s\}$, is denoted by $s_i$. This is by
way of recapitulation of what has gone before in earlier sections of this book.


For the sake of simplicity, we do not distinguish here between the spin 
variable at a site and the value of that variable (+1 or -1) in any given spin 
configuration on the lattice. This implies a certain degree of indiscipline, 
but will not be of harmful effect, as it has not been in earlier sections of this 


book. 


Close to criticality, the correlation length ($\xi$) is the only relevant length
scale describing the state of the system, where $\xi \to \infty$ as $T \to T_c$. In other
words the system looks the same, in a statistical sense, on all length
scales as the critical point is approached. Imagine now that the lattice
is inflated by a scale factor $b$ by replacing the $b^D$ spins located at
a block of lattice points with a single spin of the re-scaled system. In
other words, a block of $b^D$ unit cells of the original lattice (call
it $L_1$; it is convenient to think in terms of a simple cubic lattice in $D$
dimensions) is now replaced with a single unit cell of the re-scaled lattice
($L_2$). Given a spin configuration in $L_1$, the spins at all the sites of the block
mentioned above are assumed to be replaced with a single spin in $L_2$ by
some appropriate rule of correspondence, so that the mapping from the
spin configuration $\{s\}$ in $L_1$ to the corresponding configuration in $L_2$ is
unique.


Assuming that the sites in $L_2$ are labeled with index $a$, the set of spins
$s_i$ at sites (with indices $i$ ranging through a set of $b^D$ values)
in any particular block of $L_1$ is transformed to a spin $s'_a$ at a single site
(with some appropriate value of the label $a$) of $L_2$. If the average of the $b^D$
spins in the block of $L_1$ (for the given spin configuration under
consideration in $L_1$) is, say, $\bar{s}$, then a possible rule of transformation
would be $s'_a = +1$ if $\bar{s} > 0$, and $s'_a = -1$ if $\bar{s} < 0$. With this, or any
other similar rule of transformation, one obtains, from any given spin
configuration $\{s\}$ in $L_1$, a well-defined spin configuration $\{s'\}$ in $L_2$, though
more than one configuration in $L_1$ may give rise to a single configuration
in $L_2$. This transformation from spin configurations in $L_1$ to those in
$L_2$ may be looked upon as a coarse-graining by a scale factor $b$, and is
referred to as a block spin transformation in the case of a lattice-based
spin model.


The block spin operation is illustrated in fig. 6-16 below which depicts
schematically a 2D Ising lattice ($D = 2$) partitioned into blocks containing
$3 \times 3 = 9$ spins each ($b = 3$). As a result of this operation the number of
spins is reduced by a factor of $b^D$ ($= 9$) since each block of spins in the
original lattice ($L_1$) is replaced with a single spin in the re-scaled lattice
$L_2$. As mentioned above, the value of the spin at any chosen site in $L_2$ is
determined by the spins in the block (in $L_1$) it comes from.
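The ‘average rule’ on a square lattice can be sketched in a few lines of code. This is a minimal illustration only: the function name `block_spin`, the representation of the lattice as a list of rows, and the sample configuration are all choices made here, and $b$ is taken to be odd so that a block average can never vanish.

```python
def block_spin(lattice, b):
    """Coarse-grain a square lattice of +1/-1 spins by the 'average rule'.

    lattice: list of rows (each a list of +1/-1); side length a multiple of b.
    b: scale factor, taken odd so that the block sum is never zero.
    """
    n = len(lattice)
    if b % 2 == 0 or n % b != 0 or any(len(row) != n for row in lattice):
        raise ValueError("need a square lattice whose side is a multiple of an odd b")
    coarse = []
    for top in range(0, n, b):
        row = []
        for left in range(0, n, b):
            total = sum(lattice[i][j]
                        for i in range(top, top + b)
                        for j in range(left, left + b))
            row.append(+1 if total > 0 else -1)   # sign of the block average
        coarse.append(row)
    return coarse

# A 6 x 6 configuration in L1 (hypothetical sample), coarse-grained with b = 3.
L1 = [
    [+1, +1, -1, -1, -1, -1],
    [+1, -1, +1, -1, +1, -1],
    [+1, +1, +1, +1, -1, -1],
    [-1, -1, -1, +1, +1, +1],
    [-1, +1, -1, +1, +1, +1],
    [+1, -1, -1, +1, -1, +1],
]
L2 = block_spin(L1, 3)
print(L2)
```

The 36 spins of the sample are mapped to $36/b^D = 4$ block spins. Many different configurations of `L1` produce the same `L2`, which is the many-to-one character of the mapping noted in the text.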


Figure 6-16: Illustrating the block spin transformation for a 2D Ising lattice ($D = 2$);
the lattice ($L_1$) shown on the left is partitioned into blocks containing $3 \times 3 = 9$ spins each
($b = 3$); as a result of this operation the number of spins is reduced by a factor of $b^D$ ($= 9$)
since each block of spins in $L_1$ is replaced with a single spin in the re-scaled lattice $L_2$;
the value of the spin at any chosen site in $L_2$ is determined by the spins in the block
(in $L_1$) it comes from; the correspondence between blocks in $L_1$ and single spins in $L_2$ is
shown by means of dotted arrows; the spins in $L_2$ are depicted as ‘up’ (+1) or ‘down’ (-1)
in accordance with the ‘average rule’ mentioned in the text applied to the spins in the
blocks in $L_1$ they come from; spin configurations in two blocks of $L_1$ are shown by way
of example, while the configurations in other blocks, indicated by a dotted rectangle and
dots, are not shown; correspondingly, the states of two of the spins in $L_2$ are shown by
arrows, while those of other spins are not shown.


Starting from the Hamiltonian $H(\{s\})$ of the original system one can define
a Hamiltonian $H'(\{s'\})$ resulting from the block spin operation as
follows:


We consider some particular spin configuration $\{s'\}$ in $L_2$, chosen arbitrarily,
and denote by $C(\{s'\})$ the set of configurations in $L_1$ that lead to
$\{s'\}$ in the transformation defined above (i.e., a re-scaling followed by a
transformation of spin configurations $\{s\} \to \{s'\}$). We now define $H'(\{s'\})$
by the requirement

$$e^{-\beta H'(\{s'\})} = e^{N\mu} \sum_{\{s\} \in C(\{s'\})} e^{-\beta H(\{s\})}, \tag{6-137}$$
where $\mu$ is some constant that may be determined from the requirement
of the form-invariance of the Hamiltonian under the block spin transformation
since the theory to be formulated requires that the transformed
Hamiltonian $H'(\{s'\})$ be of the same form as the original Hamiltonian
$H(\{s\})$ (this requirement has to be satisfied as a natural consequence of
the self-similarity under coarse-graining close to criticality). On looking
for the general form of a spin Hamiltonian $H(\{s\})$, it turns out that a natural
requirement to be satisfied by the latter is $\sum_{\{s\}} H(\{s\}) = 0$ [67]. In
order that the same requirement is satisfied by $H'(\{s'\})$, the pre-factor $e^{N\mu}$
is included on the right hand side of (6-137), where $N$ is the number of
spins in $L_1$ and $\mu$ is determined from the above requirement on $H'(\{s'\})$.


Having defined the transformed Hamiltonian $H'(\{s'\})$ this way (recall that
the configuration $\{s'\}$ was chosen arbitrarily) one observes that the partition
function remains essentially unchanged under the transformation:

$$Z' = \sum_{\{s'\}} e^{-\beta H'(\{s'\})} = e^{N\mu} \sum_{\{s\}} e^{-\beta H(\{s\})} = e^{N\mu} Z, \tag{6-138}$$

where the second equality is obtained by noting that the union of the sets
$C(\{s'\})$ (recall that $C(\{s'\})$ is the set of configurations in $L_1$ that give rise
to the configuration $\{s'\}$ in $L_2$ under the transformation) is the set of all
possible spin configurations in $L_1$, i.e., $\sum_{\{s'\}} \sum_{\{s\} \in C(\{s'\})} = \sum_{\{s\}}$.
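The statement that the sets $C(\{s'\})$ partition the configurations of $L_1$ can be checked by brute force on a very small system. The sketch below assumes, purely for illustration, a short open chain of $N = 6$ spins with a nearest-neighbor coupling $K$ (so that $-\beta H = K\sum_i s_i s_{i+1}$), blocks of $b = 3$ spins, and the average rule; all function names are hypothetical.

```python
import math
from itertools import product

def minus_beta_H(spins, K):
    """-beta*H for an open chain with nearest-neighbor coupling K (illustrative choice)."""
    return K * sum(spins[i] * spins[i + 1] for i in range(len(spins) - 1))

def block_image(spins, b=3):
    """Average rule applied to consecutive blocks of b spins (b odd)."""
    return tuple(+1 if sum(spins[k:k + b]) > 0 else -1
                 for k in range(0, len(spins), b))

def partial_sums(N, K, b=3):
    """Map each coarse configuration {s'} to the Boltzmann sum over C({s'})."""
    sums = {}
    for spins in product((+1, -1), repeat=N):
        image = block_image(spins, b)
        sums[image] = sums.get(image, 0.0) + math.exp(minus_beta_H(spins, K))
    return sums

N, K = 6, 0.7
sums = partial_sums(N, K)
Z = sum(math.exp(minus_beta_H(s, K)) for s in product((+1, -1), repeat=N))
print(len(sums), sum(sums.values()), Z)
```

Each entry of `sums` is the right hand side of (6-137) without the pre-factor, and adding the entries over all coarse configurations reproduces $Z$, which is the content of the second equality in (6-138).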


As mentioned above, the form-invariance of the Hamiltonian and the invariance
of the partition function (in the sense mentioned above) are a consequence
of the fact that the transformation describes a coarse-graining
(by means of inflation by a scale factor $b$) under which the system remains
unchanged in a statistical sense. Now, the original Hamiltonian and the
partition function are defined in terms of a set of parameters that we collectively
denote by the symbol $K$ where, in general, $K$ stands for a vector
with an infinite number of components.


In the case of a Hamiltonian of some special form like, for instance, one with
the nearest neighbor interaction, $K$ may be made up of only a few components
such as the single coupling strength $J$ of the exchange interaction in
the Ising model.


The coarse-graining described above then defines a set of parameters $K'$
specifying the Hamiltonian $H'(\{s'\})$, i.e., in the end, corresponds to a transformation
$K \to K'$ in the parameter space. In the case of the most general
form of the spin Hamiltonian, this is indeed a well defined transformation
in the infinite dimensional parameter space. However, if one starts with
a Hamiltonian of some special form (such as the one based on the nearest
neighbor interaction) where $K$ has only a finite number of components,
then it is not guaranteed that the transformed Hamiltonian will be of
that same special form (a simple exception, for which the form is preserved,
is the 1D Ising model, as we see below). One then has to judiciously choose a subset of the
components of $K'$ in terms of which a transformation $K \to K'$ can be formulated
that captures the essential requirement of self-similarity under the
coarse-graining.


The mapping $K \to K'$ defined by the above prescription is referred to as
the renormalization group transformation. It possesses the group property
of closure under composition since, when followed by a second transformation
$K' \to K''$ (corresponding to a scale change from $L_2$ to, say, $L_3$), it
defines a transformation $K \to K''$ of the same type (the associativity property
is also satisfied), though it does not possess an inverse (the mapping
$\{s\} \to \{s'\}$ is many-to-one; coarse-graining can be well-defined, but
‘fine-graining’ is not).


We introduce the function (a nonlinear operator) $R$ in the parameter space
as

$$K \to K' = R(K). \tag{6-139}$$


The function R includes all the information regarding how the system 
under consideration behaves under coarse-graining. In particular, one 
can identify the fixed points of the renormalization group transformation 


as solutions of 


R(K) =K. (6-140) 


In a numerical investigation one needs to set up a suitable approximation
to $R$, defining it in an appropriately truncated parameter space. One
can then determine the fixed point(s) of the transformation and the linearized
behavior near a fixed point of interest (see sec. 6.4.5.2 below).
In sec. 6.4.4 below, we first consider the RG transformation for the 1D
Ising model (for zero external field) for which the parameter space is one
dimensional and there exists an exact construction of the operator $R$.
Recall, however, that the model does not admit of a continuous phase
transition since an ordered state exists only at $T = 0$.


The renormalization group approach outlined in the above paragraphs
is referred to as real space renormalization, where the coarse-graining,
re-scaling, and renormalization are all done with reference to the ambient
space (of dimension $D$) in which the system under consideration is
placed. An alternative renormalization program can be formulated with
reference to the momentum space related to the real space by Fourier
transformation. This will be briefly mentioned in sec. 6.4.6 below, where
a perturbation technique, the so-called $\epsilon$-expansion, will also be referred
to as one of great relevance.


6.4.4 The RG transformation for the 1D Ising model


We begin with the Hamiltonian of a 1D Ising chain containing $N$ spins
with open boundary condition, expressed in the form

$$H(\{s\}) = -J \sum_{i=1}^{N-1} s_i s_{i+1}. \tag{6-141a}$$

Introducing the dimensionless coupling constant $K = \beta J$ so that

$$-\beta H = K \sum_{i=1}^{N-1} s_i s_{i+1}, \tag{6-141b}$$

and partitioning the entire spin chain into blocks of three spins each
($b = 3$) as shown in fig. 6-17, we write the partition function $Z(K) = \sum_{\{s\}} e^{-\beta H}$
as

$$Z = \sum_{\{s\}} e^{-\beta H(\{s\})} = \sum_{s_1 = \pm 1} \sum_{s_2 = \pm 1} \sum_{s_3 = \pm 1} \cdots\; e^{K s_1 s_2}\, e^{K s_2 s_3}\, e^{K s_3 s_4} \cdots. \tag{6-141c}$$


The transformed Hamiltonian $H'(\{s'\})$ and the partition function
$Z'(K')$ can be constructed by the following rule of correspondence.


The spin $s'_1$ resulting from $s_1, s_2, s_3$ under the block spin operation is set
at $s'_1 = s_2$; similarly, $s'_2$ resulting from $s_4, s_5, s_6$ is set at $s'_2 = s_5$, and so
on (i.e., $s'_k = s_{3k-1}$ ($k = 1, 2, 3, \cdots$): the ‘central spin’ rule). This rule of
correspondence appears to differ from the ‘average rule’ mentioned earlier
($s'_k = +1$ if $\bar{s}^{(k)} > 0$, $s'_k = -1$ if $\bar{s}^{(k)} < 0$, where $\bar{s}^{(k)} = \frac{1}{3}\sum_{j=3k-2}^{3k} s_j$) but the
two prescriptions lead to the same result for sufficiently low temperatures
where the system can be described in terms of large blocks of ordered
spins.


Figure 6-17: Depicting the block spin operation for a 1D Ising chain ($D = 1$) of spins;
blocks of three spins ($b = 3$) in the original lattice $L_1$ shown on the left are replaced
with a single spin each in the re-scaled lattice $L_2$ shown on the right, the correspondence
between blocks and single spins being shown by means of dotted arrows; a spin
configuration in $L_1$ corresponds to spin values $s_1, s_2, s_3, \cdots$; in the corresponding configuration
$s'_1, s'_2, s'_3, \cdots$ in $L_2$ resulting from the block spin transformation, $s'_1$ is determined
by $s_1, s_2, s_3$ in accordance with a specified rule of correspondence, $s'_2$ is determined by
$s_4, s_5, s_6$, and so on; the spins in $L_2$ are depicted as ‘up’ (+1) or ‘down’ (-1) in accordance
with the ‘central spin rule’ mentioned in the text applied to the spins in the blocks in $L_1$
they come from; three blocks in $L_1$ and three spins in $L_2$ are shown.


Making use of the above rule of correspondence in the block spin transformation,
and the principle (6-137), one obtains

$$e^{-N\mu - \beta H'(\{s'\})} = \sum_{s_3, s_4, s_6, s_7, \cdots} e^{K s_1 s_2}\, e^{K s_2 s_3}\, e^{K s_3 s_4}\, e^{K s_4 s_5} \cdots \tag{6-142}$$

(reason this out). In the above expression, we have suppressed the sum
over $s_1$, the sum over $s_N$ being similarly suppressed. This effectively
corresponds to a fixed boundary condition (recall that, generally speaking,
boundary conditions are not relevant in the thermodynamic limit);
at the same time we will drop the first and the last factors containing
$s_1, s_N$ (we assume that $N$ is a multiple of three) in the product on the right
hand side. One can, of course, work out the transformed partition function
even without these simplifying measures, but the result would be the
same in the thermodynamic limit since these correspond to manipulating
the boundary spins in the chain.


Let us now work out the sum on $s_3, s_4$, and similarly on $s_{3k}, s_{3k+1}$ ($k =
2, 3, \cdots$). When these sums are performed, what will remain are the spins
$s_2, s_5, \cdots$. We work out, as a sample, the sum on $s_3, s_4$ as follows.


Note first of all that, for two distinct spins $s_a$, $s_b$,

$$e^{K s_a s_b} = \cosh K\,(1 + s_a s_b \tanh K), \tag{6-143a}$$

(check this out; recall that each of $s_a$, $s_b$ can have a value $\pm 1$, i.e., $s_a^2 = s_b^2 =
1$). One then obtains,

$$\sum_{s_3, s_4} e^{K s_2 s_3}\, e^{K s_3 s_4}\, e^{K s_4 s_5} = 4\cosh^3 K\,(1 + s_2 s_5 \tanh^3 K), \tag{6-143b}$$

(check this out). The sums over $s_{3k}, s_{3k+1}$ ($k = 2, 3, \cdots$) are similarly performed.
Making use of the identity (6-143a), one obtains, from (6-142),

$$e^{-N\mu - \beta H'} = \left(\frac{4\cosh^3 K}{\cosh K'}\right)^{N'} e^{K' s'_1 s'_2}\, e^{K' s'_2 s'_3} \cdots, \qquad \left(N' = \frac{N}{3}\right), \tag{6-144a}$$

where we introduce a transformed dimensionless coupling constant $K'$ as

$$\tanh K' = \tanh^3 K, \tag{6-144b}$$

(check this out). One thus observes that, identifying the pre-factor $e^{-N\mu}$
with $\left(\frac{4\cosh^3 K}{\cosh K'}\right)^{N'}$ in this case, the transformed Hamiltonian is obtained as


$$-\beta H'(\{s'\}) = K' \sum_{i=1}^{N'-1} s'_i s'_{i+1}, \tag{6-145}$$

which is exactly of the same form as (6-141b), where the coupling constant
$K'$ is obtained from $K$ by the RG transformation

$$K \to K' \,(= R(K)) = \tanh^{-1}(\tanh^3 K). \tag{6-146}$$
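The ‘check this out’ steps leading to (6-146) can be verified numerically by running over all spin values. The sketch below is only a check under the stated formulas; the function names are hypothetical.

```python
import math
from itertools import product

SPIN_PAIRS = list(product((+1, -1), repeat=2))

def rg_map(K):
    """One step of (6-146): K' = artanh(tanh^3 K)."""
    return math.atanh(math.tanh(K) ** 3)

def max_error_143a(K):
    """Largest deviation between the two sides of (6-143a) over all spin values."""
    return max(abs(math.exp(K * sa * sb)
                   - math.cosh(K) * (1 + sa * sb * math.tanh(K)))
               for sa, sb in SPIN_PAIRS)

def max_error_143b(K):
    """Largest deviation between the two sides of (6-143b) over s_2, s_5 = +1, -1."""
    worst = 0.0
    for s2, s5 in SPIN_PAIRS:
        lhs = sum(math.exp(K * (s2 * s3 + s3 * s4 + s4 * s5))
                  for s3, s4 in SPIN_PAIRS)
        rhs = 4 * math.cosh(K) ** 3 * (1 + s2 * s5 * math.tanh(K) ** 3)
        worst = max(worst, abs(lhs - rhs))
    return worst

def max_error_bond(K):
    """Check 4 cosh^3 K (1 + s s' tanh^3 K) = (4 cosh^3 K / cosh K') exp(K' s s')."""
    Kp = rg_map(K)
    prefactor = 4 * math.cosh(K) ** 3 / math.cosh(Kp)
    return max(abs(4 * math.cosh(K) ** 3 * (1 + s * sp * math.tanh(K) ** 3)
                   - prefactor * math.exp(Kp * s * sp))
               for s, sp in SPIN_PAIRS)

for K in (0.1, 1.0, 2.0):
    print(K, rg_map(K), max_error_143a(K), max_error_143b(K), max_error_bond(K))
```

The last function confirms the step from (6-143b) to (6-144a): each summed pair of spins leaves behind one factor $4\cosh^3 K/\cosh K'$ and one nearest-neighbor bond of strength $K'$ between the surviving central spins.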


Referring back to (6-142), recall that it was obtained by doing a partial
sum on $e^{-\beta H(\{s\})}$. If we now make a summation over the remaining spin
variables (i.e., over $s_2 (= s'_1), s_5 (= s'_2), \cdots$), we will end up with $Z$ on the right
hand side and $e^{-N\mu} Z'$ on the left. Noting that $Z$ and $Z'$ are obtained as
spin sums from identical Hamiltonians with just the coupling constants
and the number of spins differing, we arrive at


$$e^{-N\mu} Z'(N', K') = Z(N, K), \tag{6-147a}$$

where

$$N' = \frac{N}{3}, \qquad \mu = -\frac{1}{3}\ln\frac{\cosh^3 K}{\cosh K'} - \frac{2}{3}\ln 2, \tag{6-147b}$$

and $K'$ is related to $K$ as in (6-146). These constitute the basic results of
the present section. Noting that, for large $N$, $\ln Z$ is of the form

$$\ln Z(N, K) = -N\zeta(K), \tag{6-148a}$$

where $\zeta$ equals $\beta$ times the free energy per spin and depends only on the
dimensionless coupling constant $K$, formula (6-147a) can be written as

$$\zeta(K) = \mu(K) + \frac{1}{3}\zeta(K'),$$

$$\text{(i.e.,)}\quad \zeta(K') = 3\zeta(K) - 3\mu(K), \tag{6-148b}$$


where $\mu(K)$ is given by the second equality in (6-147b), in which $K'$ is to
be expressed in terms of $K$ as in (6-146). These results can be stated
in the reverse since, given $K'$ and $\zeta(K')$ one can obtain $K$ and $\zeta(K)$ even
though the block spin operation itself cannot be reversed.
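The relation (6-147a) can be tested by direct enumeration on a small chain. One assumption is made here that differs from the text: the chain is closed into a ring (periodic boundary condition), because in that case the central-spin decimation holds exactly for any finite $N$ that is a multiple of three, whereas for the open chain the equality is reached only in the thermodynamic limit. Function names are hypothetical.

```python
import math
from itertools import product

def rg_map(K):
    """(6-146): K' = artanh(tanh^3 K)."""
    return math.atanh(math.tanh(K) ** 3)

def mu(K):
    """mu(K) with K' = rg_map(K), so that exp(-N mu) = (4 cosh^3 K / cosh K')^(N/3)."""
    Kp = rg_map(K)
    return -math.log(math.cosh(K) ** 3 / math.cosh(Kp)) / 3 - 2 * math.log(2) / 3

def Z_ring(N, K):
    """Brute-force partition function of a closed chain of N spins (assumed periodic)."""
    return sum(math.exp(K * sum(s[i] * s[(i + 1) % N] for i in range(N)))
               for s in product((+1, -1), repeat=N))

N, K = 9, 0.8
lhs = math.exp(-N * mu(K)) * Z_ring(N // 3, rg_map(K))   # e^{-N mu} Z'(N', K')
rhs = Z_ring(N, K)                                       # Z(N, K)
print(lhs, rhs)
```

The two printed numbers agree to rounding error, which shows that the pre-factor in (6-147b) accounts for everything that is lost when the nine-spin ring is replaced by a three-spin ring with the weaker coupling $K'$.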


These results can be stated as follows: starting with the partition function
$Z$ and hence $\zeta$ (which is nothing but $\beta$ times the free energy per spin ($f$))
for a given value of $K = K^{(1)}$, say, one can obtain $\zeta(K^{(2)})$ from (6-148b),
where $K^{(2)}$ results from $K^{(1)}$ by the RG transformation (6-146). This operation
can be repeated so that one obtains a sequence of values of the
dimensionless coupling constant

$$K^{(1)} \to K^{(2)} \to \cdots \to K^{(n)} \quad (n \to \infty)$$

along with the corresponding values of $\zeta = \beta f$,

$$\zeta(K^{(1)}) \to \zeta(K^{(2)}) \to \cdots \to \zeta(K^{(n)}) \quad (n \to \infty).$$


Figure 6-18: Depicting the RG transformation (formula (6-146)) for a 1D Ising chain
in a plot where successive values of the dimensionless coupling constant $K$ are shown
under repeated applications of the transformation (shown by means of arrows); starting
from an arbitrarily large value of $K (= K^{(1)})$, progressively smaller values ($K^{(2)}, K^{(3)}, \cdots$)
are generated, with $K^{(n)} \to 0$ for $n \to \infty$; in the one-dimensional parameter space,
$K = 0$ and $K = \infty$ represent respectively the stable and unstable fixed points of the
transformation; on applying the transformation in a reverse direction, one approaches
$K \to \infty$, which corresponds to a phase transition under the unphysical condition $T = 0$.


In this sequence of transformations, the value of the coupling constant
decreases at each iteration ($K^{(1)} > K^{(2)} > \cdots$; check this out). We confine
our considerations to a ferromagnetic interaction ($J > 0$), for which
the dimensionless coupling constant is positive ($K > 0$) at every stage of
iteration, and the sequence of RG transformations causes the coupling
constant to tend to zero, as depicted in fig. 6-18. We note that, for a given
value of the strength of exchange interaction $J$, $K$ going to 0 effectively
means $T \to \infty$, i.e., the sequence of RG transformations causes the system
to tend to a configuration where the spins are effectively decoupled
from one another.
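The monotonic decrease of the coupling constant is easy to see by iterating (6-146) numerically. The following is a minimal sketch; the starting value $K^{(1)} = 3$ and the number of steps are arbitrary choices made for illustration.

```python
import math

def rg_map(K):
    """One RG step (6-146) for the 1D Ising chain with b = 3."""
    return math.atanh(math.tanh(K) ** 3)

def rg_flow(K_start, steps):
    """Return [K^(1), K^(2), ...] generated by repeated application of rg_map."""
    flow = [K_start]
    for _ in range(steps):
        flow.append(rg_map(flow[-1]))
    return flow

flow = rg_flow(3.0, 8)
for n, K in enumerate(flow, start=1):
    print(f"K^({n}) = {K:.6g}")
```

For large $K$ each step lowers $K$ by roughly $\frac{1}{2}\ln 3$, while for small $K$ one has $K' \approx K^3$, so that the approach to the fixed point $K = 0$ becomes extremely rapid once $K$ has dropped below unity.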


While the block spin transformation itself ($\{s\} \to \{s'\}$) does not possess a
well-defined inverse, the sequence of RG transformations can be run in
reverse where, starting from a low value of $K$ ($\approx 0$) and from the corresponding
$\zeta(K)$, one can obtain the values of $\zeta(K)$ for a sequence of successively
higher values of $K$, with $K \to \infty$. If the initial value of $K$ is taken
sufficiently close to 0, one can assume $\zeta \approx -\ln 2$ (reason this out), and a
few steps of reverse iteration provide us with good estimates of $\zeta (= \beta f)$
for successively lower values of temperature, the estimates being progressively
more reliable as one starts with smaller and smaller initial values
of $K$, as can be verified by referring to the exact formula (6-10b) for the
1D Ising model.
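The reverse iteration can be sketched as follows. The comparison uses the standard closed form $\zeta = -\ln(2\cosh K)$ for the 1D chain in the thermodynamic limit, which is assumed here to be what formula (6-10b) states; the starting value $K = 0.01$ and the function names are illustrative choices.

```python
import math

def reverse_step(K_prime, zeta_prime):
    """Invert (6-146) and apply zeta(K) = mu(K) + zeta(K')/3, cf. (6-148b)."""
    K = math.atanh(math.tanh(K_prime) ** (1.0 / 3.0))
    mu = (-math.log(math.cosh(K) ** 3 / math.cosh(K_prime)) / 3
          - 2 * math.log(2) / 3)
    return K, mu + zeta_prime / 3

def zeta_exact(K):
    """Assumed exact result for the 1D chain: zeta = -ln(2 cosh K)."""
    return -math.log(2 * math.cosh(K))

K, zeta = 0.01, -math.log(2)      # start near the K = 0 fixed point
history = []
for _ in range(6):
    K, zeta = reverse_step(K, zeta)
    history.append((K, zeta, zeta_exact(K)))

for K_n, z_n, z_ex in history:
    print(f"K = {K_n:.4f}  zeta = {z_n:.8f}  exact = {z_ex:.8f}")
```

Since the recursion itself is exact, the only error is the one made in the starting estimate $\zeta \approx -\ln 2$, and each reverse step divides it by three. This is why a smaller starting $K$ gives more reliable estimates at every later stage.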


The two limiting values, $K = 0$ and $K = \infty$, constitute the fixed points of
the RG transformation (6-146) for the 1D Ising chain, of which the former
is a stable fixed point, while the latter is an unstable one (the stability
characteristics get reversed under the inverse transformation $K' \to K$).
The unstable fixed point ($K = \infty$, $T = 0$) corresponds to the ‘critical point’
for the model which, however, is an unphysical one. In contrast, Ising
models in dimension $D \geq 2$ characterized, in general, by more than two
parameters possess critical points at temperatures $T > 0$. We now turn
to the consideration of the RG transformation for such systems of more
general description.


6.4.5 RG transformation: general considerations 
6.4.5.1 The 2D Ising model 


The 2-dim (or, ‘2D’) Ising model is a system of 2-state spins arranged in a
two dimensional array, where the array is commonly taken to be a simple
square lattice. Generally speaking, the model involves two coupling
constants, one pertaining to the nearest neighbor interaction along rows
and the other to interactions along columns of the array. Even when both
the coupling constants are assumed to be the same the RG transformation
resulting from the block spin operation differs fundamentally from
the one we found for the 1D Ising chain in that the interactions in the
decimated lattice are no longer of the simple nearest-neighbor type. Put
differently, the Hamiltonian resulting from the block spin operation is not
of the same form as the original Hamiltonian, and one has to keep track
of several parameters ($K_1, K_2, K_3, \cdots$) as the RG transformation is applied
repeatedly.


In the case of the 2D Ising model one can, by cleverly combining the
parameters corresponding to the non-nearest neighbor interactions with
the nearest neighbor coupling constant, reduce the problem to one where
one needs to look at the RG transformation of one single parameter
([97]) as in the 1D Ising chain (see [99], sec. 8.4 for a treatment
based on two parameters). Such an approximation works well for the 2D
model, but the transformation differs fundamentally from that in the 1D
case in that now there are three fixed points, two of which, at $K = 0$ and $K =
\infty$, are stable ones while the third, at an intermediate value of $K$, is an
unstable fixed point. The flow in the one dimensional parameter space
(I repeat that this one dimensional space arises only as the result of an
approximation) under repeated applications of the RG transformation is
shown schematically in fig. 6-19 where the unstable fixed point $K^*$ is
found to correspond closely to the temperature of the paramagnetic to
ferromagnetic transition as predicted by the Onsager theory (refer back
to sec. 6.2.4.5).


Figure 6-19: The RG transformation for a 2D Ising model shown in a parameter
space of one single dimension (schematic), constructed as an effective truncation from
a higher dimensional space; the Hamiltonian is not form-invariant under the block spin
operation, as a result of which the RG transformation involves more than one parameter;
these are combined appropriately to arrive at the single effective parameter $K$; the
nature of the flow, shown by means of the arrows, in this one dimensional parameter
space shows that, in contrast to the case of the 1D Ising chain (fig. 6-18), there are two
stable fixed points at $K = 0, \infty$ and, in addition, one unstable fixed point at an intermediate
value $K = K^*$, that corresponds closely to the paramagnetic to ferromagnetic
phase transition predicted by the Onsager theory.


These observations based on the 2D Ising model can be stated in more
general terms. We consider a spin model with spatial dimension $D$ and
assume for the sake of simplicity that the RG transformation can be expressed
in terms of one single effective parameter $K$ (the case of more
than one effective parameter will be considered below). At sufficiently
low temperatures, the system is comprised of large blocks of ordered
spins, and the energy is then dominated by interactions between the
boundary spins of adjacent blocks. In the block spin operation, each
of the blocks gets reduced to a single spin in the re-scaled lattice, and
the interaction between the block boundaries in the original lattice appears
as the interaction between adjacent spins in the re-scaled lattice.
Assuming a scale factor $b$ characterizing the block spin operation, the effective
coupling in the re-scaled lattice is related to that in the original
lattice as, roughly,

$$K' \sim b^{D-1} K, \tag{6-149a}$$

(reason this out: two adjacent blocks share a face crossed by $b^{D-1}$
nearest neighbor bonds). One observes that, for $D > 1$, a repeated application
of the RG transformation leads to $K \to \infty$, which means that the point
$K = \infty$ ($T = 0$) in the parameter space is a stable fixed point, in sharp
contrast to the 1D case. The point at $K = 0$ ($T = \infty$) continues to be a
stable fixed point (the formula (6-149a) does not apply to this case since
the system state is not dominated by block spins). One then infers that
there has to be an intermediate temperature $T = T_c$ and an intermediate
value of the parameter $K (= K^*)$ that corresponds to an unstable fixed
point in the parameter space. That this actually characterizes a critical
point can be inferred by looking at the transformation of the correlation
length $\xi$ (see below).


For a complete description of how the system behaves under a scale
transformation sufficiently close to a critical point (when the spins are
correlated over large distances), one has to state, in addition, the formula
for the transformation of the free energy. This is obtained from (6-148b)
as

$$\zeta(K') \sim b^D \zeta(K), \tag{6-149b}$$

(recall that $\zeta$ stands for the free energy per spin in units of $k_B T$) where we
consider the transformation of only the singular part of the free energy.


One can supplement the above formulae with the transformation of the
correlation length $\xi$ near the critical point by noting that, in a scale transformation,
lengths get inflated by a factor of $b$. Since $\xi$ has the dimension
of length, one obtains

$$\xi(K) \sim b\,\xi(K'). \tag{6-150}$$


Consider now a point $K$ close to the unstable fixed point $K^*$, starting from
which $n$ iterations of the RG transformation lead to the point
$K_0$ (say), located between $K^*$ and $K = 0$, at which the spins are effectively
uncorrelated, i.e., $\xi(K_0)$ is of the order of unity (the lattice constant). One
then has, from (6-150),

$$\xi(K) \sim b^n \xi(K_0). \tag{6-151}$$

As the location of the initial point $K$ is chosen to be closer and closer
to the unstable fixed point $K^*$, the number of iterations ($n$) required to
reach $K_0$ will increase unboundedly. Thus, since $\xi(K_0) \sim 1$, we obtain
$\xi(K) \to \infty$ as $K \to K^*$. In other words, the unstable fixed point $K^*$ indeed
corresponds to a critical point for which the spins are correlated over an
infinite distance.


Note from the transformation formula (6-150) that the fixed points of the
RG transformation correspond to either $\xi = 0$ or $\xi = \infty$. The former corresponds
to an attracting fixed point under the RG transformation
relating to a disordered state at $T = \infty$ while the latter can
be either an attracting fixed point at $T = 0$ or a repelling fixed
point (or, more precisely, a fixed point of the ‘mixed’ type, attracting
along some directions and repelling along some others)
corresponding to a critical point.

Starting from the RG transformation rules of the free energy and the
correlation length, one can work out the critical exponents from the linearization
of the RG transformation near $K = K^*$. We illustrate this for
the simple case of a one dimensional parameter space by working out the
critical exponent $\nu$ characterizing the correlation length (recall that, near
the critical point, the correlation length scales as $\xi \sim \tau^{-\nu}$).


In the linear approximation around the unstable fixed point $K = K^*$, the
RG transformation (6-139) appears as

$$K' - K^* \approx R'(K^*)\,(K - K^*). \tag{6-152a}$$

In order to obtain the scaling form of the RG transformation of $\xi$, we
express the constant $R'(K^*)$ as

$$R'(K^*) = b^{y}. \tag{6-152b}$$

The parameter $K$ being a function of $T$ (recall that, for the 1D Ising model,
$K = \frac{J}{k_B T}$; a similar relation holds for the 2D Ising model with identical
exchange coupling constants along rows and columns), one can write, for
$K \approx K^*$ ($T \approx T_c$),

$$K - K^* \approx (T - T_c)\left(\frac{dK}{dT}\right)_{T=T_c}. \tag{6-153}$$

The scaling form $\xi \sim |\tau|^{-\nu}$ can then be written as

$$\xi \sim |\tau|^{-\nu} \sim |K - K^*|^{-\nu}. \tag{6-154a}$$

On comparing with (6-150) and making use of (6-152a) one obtains

$$\nu = \frac{1}{y}, \tag{6-154b}$$

(check this out).


This procedure can be generalized to the case of a system for which the
number of parameters ($K_1, K_2, \cdots$) characterizing an equilibrium configuration
is greater than one. For instance, one can refer to the parameters
$T$, $p$ in the case of a simple fluid, while the corresponding parameters for
a simple magnetic system would be $T$, $h$ (in the above paragraphs dealing
with the RG transformation for the 1D and 2D Ising models, we have considered
the special case $h = 0$). In section 6.4.5.2 below, we consider the
linearized RG transformation near a critical point for a system of such
general description and obtain the critical exponents in terms of a set of
eigenvalues of the transformation. In the above example of a one dimensional
parameter space, there is just one single eigenvalue $\lambda = R'(K^*)$,
and the critical exponent $\nu$ is related to this as (refer to (6-152b))

$$\nu = \frac{\ln b}{\ln \lambda}.$$


6.4.5.2 The linearized RG transformation: critical exponents and universality


Referring to the RG transformation of the form (6-139) for a system for
which $K$ stands for a set of parameters $K = \{K_1, K_2, \cdots\}$, let us
consider an unstable fixed point $K_c$ of the transformation corresponding
to a critical state as in the case of the 2D Ising model touched upon in
sec. 6.4.5.1 (note the slight change in notation, the fixed point having
been denoted by $K^*$ in earlier paragraphs). The linearized RG
transformation in a close vicinity of the fixed point
$K_c = \{K_{1c}, K_{2c}, \cdots\}$ is of the form

$$K'_i - K_{ic} = \sum_j \left[\frac{\partial (R(K))_i}{\partial K_j}\right]_{K = K_c} (K_j - K_{jc}). \tag{6-156a}$$

Defining

$$u = K - K_c, \qquad w_{ij} = \left[\frac{\partial (R(K))_i}{\partial K_j}\right]_{K = K_c}, \tag{6-156b}$$

one can abbreviate the above linear approximation to the RG transformation
around the fixed point $K_c$ as

$$u'_i = \sum_j w_{ij} u_j \quad (i, j = 1, 2, \cdots). \tag{6-156c}$$
Sufficiently close to the critical point, the matrix $w$ with elements
$w_{ij}$ describes the RG transformation completely. Since it is not, in
general, a symmetric matrix, one has to distinguish between its left and
right eigenvectors (the two sets share the same eigenvalues). Let
$v^{(1)}, v^{(2)}, \cdots$ denote the left eigenvectors of $w$,
corresponding to eigenvalues $\lambda^{(1)}, \lambda^{(2)}, \cdots$. In
other words,

$$\sum_i v^{(a)}_i w_{ij} = \lambda^{(a)} v^{(a)}_j \quad (a = 1, 2, \cdots). \tag{6-157}$$


We now introduce, in the place of the original parameters
$u_i$ ($= K_i - (K_c)_i$, $i = 1, 2, \cdots$), a set of transformed
parameters $x^{(a)}$ ($a = 1, 2, \cdots$) (referred to as the scaling
variables pertaining to the fixed point under consideration) as

$$x^{(a)} = \sum_i v^{(a)}_i u_i. \tag{6-158}$$

The RG transformation of these scaling variables is of a simple
multiplicative type:

$$x'^{(a)} = \lambda^{(a)} x^{(a)}. \tag{6-159}$$
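The multiplicative behavior of the scaling variables can be checked on a small example. In the sketch below the $2 \times 2$ matrix `w` is a hypothetical, non-symmetric stand-in for the linearized transformation (its entries are not taken from any model in the text); the left eigenvectors are obtained as the right eigenvectors of the transpose.

```python
import numpy as np

# Hypothetical linearized RG matrix w_ij (non-symmetric), for illustration only.
w = np.array([[2.0, 0.3],
              [0.1, 0.5]])

# Left eigenvectors of w are right eigenvectors of its transpose:
# sum_i v_i w_ij = lambda v_j  <=>  w^T v = lambda v.
eigenvalues, vecs = np.linalg.eig(w.T)
left_vectors = vecs.T                 # row a holds the left eigenvector v^(a)

u = np.array([0.02, -0.01])           # small deviation K - K_c from the fixed point
u_prime = w @ u                       # one RG step: u'_i = sum_j w_ij u_j

x = left_vectors @ u                  # scaling variables x^(a) = sum_i v^(a)_i u_i
x_prime = left_vectors @ u_prime      # the same variables after the RG step

print(eigenvalues)
print(x_prime / x)                    # each ratio reproduces the matching eigenvalue
```

Each scaling variable is simply multiplied by its own eigenvalue, whereas the original components of `u` get mixed by the transformation.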


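Hence it is of advantage to make use of these in stating the RG
transformation rules of (the singular part of) the specific free energy and
the correlation length (refer back to (6-149b), (6-150)):

$$\zeta(x^{(1)}, x^{(2)}, \cdots) = b^{-D}\, \zeta(\lambda^{(1)} x^{(1)}, \lambda^{(2)} x^{(2)}, \cdots),$$

$$\xi(x^{(1)}, x^{(2)}, \cdots) = b\, \xi(\lambda^{(1)} x^{(1)}, \lambda^{(2)} x^{(2)}, \cdots). \tag{6-160}$$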


A scaling variable $x^{(a)}$ (recall that the $x$'s are obtained from the
$K$'s by the transformation (6-158)) is said to be 'irrelevant' if the
corresponding eigenvalue $\lambda^{(a)}$ satisfies $\lambda^{(a)} < 1$,
while the condition $\lambda^{(a)} > 1$ defines a 'relevant' scaling
variable (we do not consider the marginal case $\lambda^{(a)} = 1$ since it
is rare and needs a more refined analysis). Irrelevant variables get
quenched under repeated RG transformations and so the RG transformation is
effectively described in terms of the relevant variables alone.


Consider, in the space of the $x$'s, a surface passing through the fixed
point $K = K_c$ (i.e., $x^{(a)} = 0$ for all $a$) for which the relevant
co-ordinates $x^{(a)}$ are all zero. This is referred to as the critical
surface. An initial point on this surface will approach the fixed point
under repeated applications of the RG transformation (by the action of the
irrelevant eigenvalues), while a point slightly away from the surface will
progressively move away from it by the action of the relevant eigenvalues.
This is illustrated schematically in fig. 6-20.

[Figure 6-20: schematic with labels 'critical point' and 'critical surface']

Figure 6-20: The critical surface in the parameter space with co-ordinates
$x^{(a)}$ ($a = 1, 2, \cdots$), defined by the condition that the relevant
co-ordinates are all zero (schematic); initial points on this surface
approach the unstable fixed point ($x^{(a)} = 0$ for all $a$), corresponding
to a critical state with $K = K_c$, while points slightly away from it
eventually get separated under repeated application of the RG
transformation; the rates of separation in the latter case (along the
co-ordinate axes corresponding to the relevant parameters) define a
universality class characterized by the critical surface; the axes shown
correspond to the original co-ordinates ($K_1, K_2, K_3, \cdots$).
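The contrast between points on and off the critical surface can be made concrete by iterating a linear map. The following sketch assumes a hypothetical $2 \times 2$ matrix `w` with one eigenvalue above unity and one below it; in two dimensions the critical surface is then just a line through the fixed point.

```python
import numpy as np

# Hypothetical linearized RG matrix: one relevant and one irrelevant eigenvalue.
w = np.array([[2.0, 0.3],
              [0.1, 0.5]])

eigenvalues, vecs = np.linalg.eig(w.T)
left = vecs.T                                    # rows are left eigenvectors
relevant = int(np.argmax(np.abs(eigenvalues)))   # index of the eigenvalue > 1
v_rel = left[relevant]


def flow(u, steps):
    """Apply the linearized RG transformation repeatedly."""
    for _ in range(steps):
        u = w @ u
    return u


# A start point whose relevant scaling variable v_rel . u vanishes
# (it lies on the critical surface), and one displaced slightly off it.
on_surface = 0.01 * np.array([-v_rel[1], v_rel[0]])
off_surface = on_surface + 1e-6 * v_rel

print(np.linalg.norm(flow(on_surface, 20)))    # shrinks toward the fixed point
print(np.linalg.norm(flow(off_surface, 20)))   # is driven away from it
```

Even a displacement of order $10^{-6}$ along the relevant direction eventually dominates, which is why the critical surface has to be hit exactly for the flow to end at the fixed point.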


The critical exponents are all encoded in the eigenvalues $\lambda^{(a)}$
($a = 1, 2, \cdots$) as follows. We first express the eigenvalues as powers
of the scale factor $b$ that defines the RG transformation (6-139):

$$\lambda^{(a)} = b^{y_a} \quad (a = 1, 2, \cdots). \tag{6-161}$$

The use of $a$ ($= 1, 2, \cdots$) once as a super-index and then as a
sub-index carries no special significance, and is adopted for the sake of
typographic clarity.


The exponents $y_a$ ($a = 1, 2, \cdots$) then lead to the critical exponents
characterizing the universality class of the system, as can be seen by
referring to the basic scaling relations (6-149b), (6-150), that apply for a
system with any chosen value of the dimension $D$. For instance, we consider
a magnetic system and assume, on physical grounds, that the parameters
$x^{(1)} = \tau$, $x^{(2)} = h$ are the relevant ones characterizing the
system, where $h$ is now expressed in dimensionless form (i.e., in units of
$k_B T$; this constitutes a slight change in notation as compared to earlier
use of the symbol $h$). Let the exponents $y_a$ corresponding to these two
relevant parameters be $y_\tau, y_h$. The transformations (6-160) then
appear as

$$\zeta(\tau, h) = b^{-D}\, \zeta(b^{y_\tau} \tau, b^{y_h} h),$$

$$\xi(\tau, h) = b\, \xi(b^{y_\tau} \tau, b^{y_h} h). \tag{6-162}$$


These transformation relations hold for any arbitrarily chosen value of
$b$ ($> 1$) since any such value can be realized from an appropriately
chosen initial value by repeated applications of the RG transformation. Put
differently, the free energy per spin ($\zeta$) and the correlation length
($\xi$) are generalized homogeneous functions of $\tau, h$ (refer to
(6-130c); recall that $\zeta, h$ are both defined in units of $k_B T$). As
mentioned in sec. 6.4.2, a generalized homogeneous function $g(x, y)$ can be
expressed in the form (6-131) for some appropriate exponents
$\lambda, \Delta$. Indeed, considering the function $\zeta(\tau, h)$ and
setting, for any arbitrarily chosen value of $\tau$,

$$b = \tau^{-1/y_\tau} \tag{6-163}$$

(recall that the first equality in (6-162) holds for arbitrarily chosen
$b$), one finds

$$\zeta(\tau, h) = \tau^{D/y_\tau}\, \tilde{\zeta}\!\left(\frac{h}{\tau^{\Delta}}\right), \tag{6-164a}$$

where the gap exponent $\Delta$ is given by

$$\Delta = \frac{y_h}{y_\tau}, \tag{6-164b}$$

and $\tilde{\zeta}$ is defined by

$$\tilde{\zeta}(z) = \zeta(1, z). \tag{6-164c}$$

Eq. (6-163) provides the basis of the Widom scaling hypothesis (refer to
(6-129), where the exponent $\alpha$ can now be related to $y_\tau$, see
below).


Starting with (6-164a), one obtains the scaling behavior of the (singular
part of the) specific heat by differentiating twice with respect to $\tau$
at $h = 0$:

$$C \sim \tau^{D/y_\tau - 2} \tag{6-165a}$$

(check this out) which thereby relates the critical exponent $\alpha$ with
$y_\tau$:

$$\alpha = 2 - \frac{D}{y_\tau} \tag{6-165b}$$

(recall that the exponent $2 - \alpha$ in the right hand expression in
(6-129) was consistent with the critical behavior, of the form
$|\tau|^{-\alpha}$, of the specific heat). In a similar manner, one obtains
the critical exponent $\nu$ describing the behavior of the correlation
length ($\xi \sim |\tau|^{-\nu}$) as

$$\nu = \frac{1}{y_\tau} \tag{6-166}$$

(check this out). One can now follow the line of derivations of sec. 6.4.2
and obtain all the critical exponents in terms of the two exponents
$y_\tau, y_h$, thereby explaining the scaling relations among them.
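As a small illustration of how two numbers fix the whole set, the sketch below tabulates the exponents from $y_\tau$, $y_h$ and $D$. The expressions for $\alpha$, $\nu$ and the gap exponent are the ones quoted above; those for $\beta$, $\gamma$ and $\delta$ are the standard consequences of differentiating the scaling form of the free energy with respect to $h$, stated here without derivation. The function name is hypothetical, and the trial values $y_\tau = 1$, $y_h = 15/8$ are the known exact values for the 2D Ising model rather than results obtained in this section.

```python
from fractions import Fraction


def critical_exponents(y_tau, y_h, D):
    """Exponents implied by the scaling forms, given y_tau, y_h and dimension D."""
    y_tau, y_h, D = Fraction(y_tau), Fraction(y_h), Fraction(D)
    return {
        "alpha": 2 - D / y_tau,          # specific heat
        "beta": (D - y_h) / y_tau,       # spontaneous magnetization
        "gamma": (2 * y_h - D) / y_tau,  # susceptibility
        "delta": y_h / (D - y_h),        # critical isotherm
        "nu": 1 / y_tau,                 # correlation length
        "gap": y_h / y_tau,              # gap exponent Delta
    }


ising_2d = critical_exponents(1, Fraction(15, 8), 2)
for name, value in ising_2d.items():
    print(name, value)

# Scaling relations hold identically, whatever the values of y_tau and y_h.
rushbrooke = ising_2d["alpha"] + 2 * ising_2d["beta"] + ising_2d["gamma"]
widom = ising_2d["beta"] * (ising_2d["delta"] - 1)
print(rushbrooke, widom, ising_2d["gamma"])
```

Exact rational arithmetic is used so that the scaling relations can be compared without rounding error.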


6.4.6 Momentum space renormalization and the epsilon expansion


I present below a brief outline of what 'momentum space renormalization'
and 'epsilon expansion' mean in the context of the renormalization group
theory. As in most other topics in this book, I skip derivations of the
results stated. Further details are to be found in [128], [67], [114], [18].


The renormalization group transformation outlined in the previous sections
is carried out in the 'real space', i.e., the ambient space (of dimension
$D$; we have considered the Ising model for $D = 1, 2$) in which the system
under consideration is placed. We now consider the corresponding
transformation in the momentum space in which a point is characterized by
the wave vector q, related to the real space variable r by means of Fourier
transformation. This is referred to as the momentum space RG transformation;
it yields the critical exponents in a manner analogous to what was found in
the real space RG program, where these (the critical exponents) are related
to the eigenvalues of the linearized transformation relations.


In illustrating the momentum space RG program, we refer to the
Landau-Ginzburg model defined in terms of an order parameter $\phi(x)$,
where x is a vector in the real space of dimension $D$ while, for any given
x, $\phi$ may be a vector with $d$ components. Such a model does not differ
significantly from one describing spins on a discrete lattice since, close
to a critical point in a phase transition, variations of the order parameter
on a small scale in the real space bear no relevance. However, there exists
a lower cut-off in $|x|$, of the order of, say, $a$, the unit cell size in
real space.

In the momentum space description, the model is described by the order
parameter, say, $\phi(q)$, obtained from $\phi(x)$ by Fourier
transformation, where there is an upper cut-off, say $\Lambda$, of the order
of the size of the first Brillouin zone; since large values of $|q|$ are not
relevant, we assume a spherical cut-off in the momentum space.


Our point of departure will be the Gaussian model obtained by retaining
terms up to those of degree two in the order parameter (refer back to
sec. 6.3.2, where the notation differs slightly), which will act as the zero
order term in the subsequent theory. Referred to the Fourier space, the
partition function and the LG Hamiltonian appear as

$$Z_0 = \int \mathcal{D}\phi(q)\, e^{-\beta H_0}, \tag{6-167a}$$

where $\mathcal{D}$ refers to functional integration, and the Hamiltonian,
involving the parameters $\tau, K, h$, reads

$$\beta H_0 = \int \frac{d^D q}{(2\pi)^D} \left[\frac{1}{2}(\tau + K q^2)\,|\phi(q)|^2\right] - h \cdot \phi(q = 0) \tag{6-167b}$$

(check this out; refer back to sec. 6.3.2), where the integral in the
q-space is cut off by $|q| = \Lambda$.


The RG transformation in the parameter space (made up of $\tau, K, h$; more
parameters will have to be considered as terms of degree higher than two
are included in the LG Hamiltonian) involves three steps described as
follows (refer back for the sake of comparison to sections 6.4.3 and 6.4.4).


(A) Elimination of redundant degrees of freedom by coarse-graining. The
coarse-graining consists of elimination of the Fourier modes with wave
vectors $\frac{\Lambda}{b} < |q| < \Lambda$ ($b > 1$), corresponding to
rapidly varying modes. The order parameter $\phi(q)$ can be decomposed into
two parts, with supports in $0 < |q| < \frac{\Lambda}{b}$ and
$\frac{\Lambda}{b} < |q| < \Lambda$ respectively,

$$\phi(q) = \phi_<(q) + \phi_>(q). \tag{6-168}$$

The two sets of modes, marked with '<' and '>', decouple from each other in
the Gaussian model and one obtains $Z_0$ in the form of a product

$$Z_0 = Z_{0>}\, Z_{0<}, \tag{6-169}$$

where $Z_{0>}$ has no relevance in the RG transformation and can be ignored
in determining the scaling law of the singular part of the free energy.

(B) Rescaling. In this step one rescales in the Fourier space as

$$q \to q' = bq, \tag{6-170}$$

thereby restoring the cut-off to its original value $\Lambda$.

(C) Renormalization of the order parameter. Finally, the order parameter is
renormalized as

$$\phi'(q') = \frac{1}{z}\, \phi_<(q), \tag{6-171a}$$

where the scale factor $z$ can be determined by demanding that the parameter
$K$ remain unaltered at the end of the three steps of renormalization
transformation mentioned above. It can also be determined by working out the
correlation function $\langle \phi(q_1)\phi(q_2) \rangle$ for
$0 < q_1, q_2 < \frac{\Lambda}{b}$ by referring to the original Hamiltonian
and equating this to $\langle \phi'(bq_1)\phi'(bq_2) \rangle$ worked out
with reference to the renormalized Hamiltonian. Both methods give
([128], [18])

$$z = b^{1 + \frac{D}{2}} \tag{6-171b}$$

(other choices for $z$ are also possible).


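The renormalized Hamiltonian and partition function work out to

$$\beta H_0' = \int \frac{d^D q'}{(2\pi)^D}\, b^{-D} z^2 \left[\frac{1}{2}(\tau + K b^{-2} q'^2)\,|\phi'(q')|^2\right] - z\, h \cdot \phi'(q' = 0), \tag{6-172a}$$

$$Z_0 = Z_{0>} \int \mathcal{D}\phi'(q')\, e^{-\beta H_0'}. \tag{6-172b}$$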
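The coarse-graining step (A) above, namely the split of the order parameter into slowly and rapidly varying Fourier modes, can be sketched on a one dimensional lattice. Everything specific here is an assumption made for illustration: the lattice spacing is set to unity, the cut-off is placed at the zone edge $\pi$, and the field configuration is random noise rather than one drawn from the LG Hamiltonian.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256                                  # number of lattice sites (spacing a = 1)
phi_x = rng.normal(size=n)               # hypothetical real-space configuration

phi_q = np.fft.fft(phi_x)                # Fourier components phi(q)
q = 2.0 * np.pi * np.fft.fftfreq(n)      # wave numbers in [-pi, pi)
cutoff = np.pi                           # upper cut-off Lambda (zone edge)
b = 2.0                                  # scale factor

slow = np.abs(q) < cutoff / b            # modes with |q| < Lambda / b
phi_less = np.where(slow, phi_q, 0.0)    # slowly varying part, kept
phi_greater = np.where(slow, 0.0, phi_q) # rapidly varying part, eliminated

# Real-space field rebuilt from the slow modes only: a smoothed configuration.
coarse_x = np.fft.ifft(phi_less).real

print(int(slow.sum()), "of", n, "modes are kept")
print(np.var(phi_x), np.var(coarse_x))
```

Steps (B) and (C) would then relabel the kept wave numbers as $q' = bq$ and divide the amplitudes by $z$; they are pure rescalings and are not shown.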


Thus, as a result of the momentum space RG transformation made up of the
three steps mentioned above, the parameters $\tau, h$ characterizing the
system transform as ($K$ remains unaltered)

$$\tau' = b^{y_\tau} \tau \;\; (y_\tau = 2), \qquad h' = b^{y_h} h \;\; \left(y_h = 1 + \frac{D}{2}\right), \tag{6-173}$$

where the exponents $y_\tau, y_h$ determine the eigenvalues of the
linearized RG transformation as in (6-161). For the Gaussian model in the
absence of a field, the RG equation predicts two fixed points at
$\tau = \infty$ and $\tau = 0$, of which the former corresponds to the high
temperature disordered phase while the latter describes the critical point
($T = T_c$).
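The exponents $y_\tau = 2$ and $y_h = 1 + \frac{D}{2}$ follow from simple power counting, which the sketch below reproduces. It takes the prefactors $b^{-D} z^2$, $b^{-D-2} z^2$ and $z$ multiplying $\tau$, $K$ and $h$ in the renormalized Gaussian Hamiltonian, fixes $z = b^{1 + D/2}$, and reads off the exponents as logarithms to base $b$. The function name and the trial values of $D$ and $b$ are arbitrary choices made for illustration.

```python
import math


def gaussian_rescaling(D, b):
    """Scale factors of tau, K, h after rescaling and renormalization."""
    z = b ** (1.0 + D / 2.0)             # fixed by demanding K' = K
    tau_factor = b ** (-D) * z ** 2      # tau' = tau_factor * tau
    K_factor = b ** (-D - 2) * z ** 2    # K'   = K_factor * K
    h_factor = z                         # h'   = h_factor * h
    y_tau = math.log(tau_factor) / math.log(b)
    y_h = math.log(h_factor) / math.log(b)
    return K_factor, y_tau, y_h


K_factor, y_tau, y_h = gaussian_rescaling(D=3.0, b=2.0)
print(K_factor, y_tau, y_h)
```

The result does not depend on the value chosen for $b$, in keeping with the fact that the exponents characterize the fixed point and not the particular coarse-graining step.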


The scaling of the singular part of the specific free energy is obtained as
(refer back to (6-129))

$$\zeta^{[\mathrm{singular}]}(\tau, h) = \tau^{D/2}\, \tilde{\zeta}\!\left(\frac{h}{\tau^{\frac{1}{2} + \frac{D}{4}}}\right), \tag{6-174}$$

the gap exponent being $\Delta = \frac{1}{2} + \frac{D}{4}$. The above
scaling formulae lead to the following values of critical exponents for the
model, agreeing with the mean field results (sec. 6.2.3.4) at $D = 4$,

$$\alpha = 2 - \frac{D}{2}, \qquad \nu = \frac{1}{2}. \tag{6-175}$$


We now look at the critical behavior as terms with higher degrees in the
order parameter are included in the Hamiltonian. For the sake of
concreteness we consider $h = 0$, including only terms of even degree in the
field $\phi$. It turns out that the fourth degree term introduces a
non-trivial change of behavior for $D < 4$ (the sixth degree term, which
will not be considered here, turns out to be relevant for $D < 3$).

It is highly rewarding to work out the RG program, including the higher
order terms in the Hamiltonian, and treating those as perturbations over
the Hamiltonian (6-167b) (with $h = 0$), where the perturbation parameter
is $\epsilon = 4 - D$.


More precisely, we will see below that if $u$ denotes the strength of the
quartic term, a new non-trivial fixed point of the RG transformation appears
at $u = u^*$ ($> 0$) that approaches 0 as $\epsilon \to 0^+$. The
perturbative result for small $\epsilon$ shows that there occurs a flow
under the RG transformation from the Gaussian fixed point
($\tau = 0, u = 0$) to this new fixed point.


Since the unperturbed Hamiltonian corresponding to the Gaussian model is
diagonal in the momentum representation, we refer to that representation in
carrying out the RG program of the perturbed Hamiltonian, confining our
considerations to the leading relevant corrections in terms of
$u$ ($\sim \epsilon$).

Thus, we start with the Hamiltonian

$$\beta H = \beta H_0 + \beta H_1, \tag{6-176a}$$

where $\beta H_0$ is the Gaussian model Hamiltonian given in (6-167b) and
the perturbation is represented by the quartic term

$$\beta H_1 = u \int d^D x\, |\phi(x)|^4 = u \int \frac{d^D q_1\, d^D q_2\, d^D q_3}{(2\pi)^{3D}}\, \phi(q_1) \cdot \phi(q_2)\; \phi(q_3) \cdot \phi(-q_1 - q_2 - q_3). \tag{6-176b}$$


One now has a system described in terms of three parameters $\tau, K, u$ on
which the RG transformation is to be applied in the three steps ((A), (B),
(C)) mentioned above. The calculation of the renormalized Hamiltonian in the
first step (step (A), coarse-graining) involves a cumulant expansion
([128], [18]) in which only the leading term in $u$ is retained. On working
out this leading contribution, one finds that the parameters $K, u$ remain
unaltered while $\tau$ gets renormalized to

$$\tau \to \tilde{\tau} = \tau + 4u(d + 2) \int_> \frac{d^D q}{(2\pi)^D}\, \frac{1}{\tau + K q^2}, \tag{6-177a}$$

where $\int_>$ denotes the integration over Fourier modes belonging to the
region $\frac{\Lambda}{b} < |q| < \Lambda$. On now employing the remaining
two transformations of rescaling and renormalization (steps (B), (C)
mentioned earlier), one finally obtains a transformed Hamiltonian of the
same form as the one we started with, defined in terms of parameters
$\tau', K', u'$ related to $\tau, K, u$ as

$$\tau' = b^{-D} z^2\, \tilde{\tau}, \qquad K' = b^{-D-2} z^2 K, \qquad u' = b^{-3D} z^4 u, \tag{6-177b}$$

where the parameter $z$ defines the renormalization of $\phi(q)$ as in
(6-171a).


As in the Gaussian case, the choice $z = b^{1 + \frac{D}{2}}$ gives
$K' = K$, while the transformations of $\tau, u$ appear as

$$\tau' = b^2 \left[\tau + 4u(d + 2) \int_> \frac{d^D q}{(2\pi)^D}\, \frac{1}{\tau + K q^2}\right], \qquad u' = b^{4 - D} u, \tag{6-178}$$

where higher order terms in $u$ are ignored (recall that $d$ stands for the
number of independent components of $\phi$; in the case of the Ising model,
$d = 1$; for a 3-dim vector field $\phi$, on the other hand, $d = 3$). One
observes that the RG transformation (referred to as the 'recursion
relation') for $\tau$ gets modified in the presence of the perturbation, by
a term proportional to the small parameter $u$.


The recursion relations for $\tau, u$ are commonly written in the form of a
pair of differential equations by assuming that $b$ is only slightly larger
than unity, say $b = 1 + \delta l$. Writing $\tau' - \tau = \delta\tau$,
$u' - u = \delta u$, one obtains

$$\frac{d\tau}{dl} = 2\tau + \frac{4u(d + 2) K_D \Lambda^D}{\tau + K\Lambda^2}, \qquad \frac{du}{dl} = (4 - D)\,u, \tag{6-179}$$

where $K_D$ is a constant defined as

$$K_D = \frac{S_D}{(2\pi)^D}, \tag{6-180a}$$

$$S_D = \frac{2\pi^{D/2}}{\Gamma(D/2)} \tag{6-180b}$$

being the surface area of a D-dimensional sphere of unit radius.


In (6-179), one identifies a single fixed point ($\tau = 0, u = 0$) for
which $y_\tau = 2$, $y_u = 4 - D$. The results at this order of perturbation
do not reveal anything novel since the fixed point is the one found in the
Gaussian model (though now with two eigenvalues, one of those being zero,
and two eigen-directions). Going over to the next higher order of
perturbation one obtains the result

$$\frac{d\tau}{dl} = 2\tau + \frac{4(d + 2) K_D \Lambda^D u}{\tau + K\Lambda^2}, \qquad \frac{du}{dl} = (4 - D)\,u - \frac{4(d + 8) K_D \Lambda^D}{(\tau + K\Lambda^2)^2}\, u^2. \tag{6-181}$$


One now finds that, in addition to the fixed point $\tau = 0, u = 0$, which
corresponds to the one found in the Gaussian model, there arises a new fixed
point for $D < 4$ ($\epsilon > 0$) at $\tau = 0$, $u = \frac{\epsilon}{B}$
($\epsilon = 4 - D$), where $B$ is a positive constant (in contrast, for
$D > 4$, there is only one single fixed point and a single universality
class, that determined by the mean field theory). Since, strictly speaking,
the perturbation theory is expected to work only for small $u$, one can
equivalently identify the perturbation parameter as $\epsilon = 4 - D$. The
calculations become increasingly complex at higher orders and the resulting
expansion is not convergent; still, one can extract meaningful and revealing
results for $D$ differing significantly from the value 4 by employing
suitable resummation techniques.


Up to order $\epsilon$, the new, 'Heisenberg fixed point' is at

$$u^* = \frac{K^2 \epsilon}{4(d + 8) K_D \Lambda^{D - 4}}. \tag{6-182}$$

The Gaussian fixed point, which was unstable to start with (with respect
to $\tau$) for $\epsilon < 0$, continues to be unstable (with respect to
both $u$ and $\tau$) for $\epsilon > 0$, while the Heisenberg fixed point is
stable with respect to $u$ and unstable with respect to $\tau$, with
$y_\tau \approx 2$, $y_u \approx -\epsilon$. Starting from any point on the
line joining the two fixed points, the RG transformation results in a flow
to the Heisenberg fixed point. Thus, all Hamiltonians characterized by
parameter values on this line belong to the same universality class,
distinct from that of all others around the Gaussian fixed point. The
nature of the RG flow in the $u$-$\tau$ plane is shown schematically in
fig. 6-21 for $\epsilon < 0$ ($D > 4$) and $\epsilon > 0$ ($D < 4$). One
notes that the positions of the fixed points in the parameter space depend
on microscopically defined parameters such as $K, \Lambda$, but the growth
rates $y_\tau, y_u$ depend only on the universality class defined in terms
of $d, D$.

[Figure 6-21: panels (A) and (B); axes $\tau$ and $u$; fixed points marked 'G' and 'H']

Figure 6-21: Fixed points and RG flow (schematic) for the model (6-176a),
(6-176b), in the $u$-$\tau$ parameter space for (A) $\epsilon < 0$, and
(B) $\epsilon > 0$. In (A), there is only one fixed point with one unstable
and one stable eigen-direction while in (B) a fixed point marked 'H' (the
so-called Heisenberg fixed point) appears in addition to the one marked 'G'
(the Gaussian fixed point) for which both directions are now unstable; the
former is characterized by one stable and one unstable direction, implying a
new universality class for Hamiltonians close to the quartic model; based on
[18], fig. 5.8.3.
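The fixed point structure for $\epsilon > 0$ can be explored numerically. The sketch below assumes the flow equations in the form $\frac{d\tau}{dl} = 2\tau + \frac{a u}{\tau + K\Lambda^2}$, $\frac{du}{dl} = \epsilon u - \frac{B u^2}{(\tau + K\Lambda^2)^2}$ with $a = 4(d+2)K_D\Lambda^D$ and $B = 4(d+8)K_D\Lambda^D$, which is how (6-181) is read here; the units $K = \Lambda = 1$, the value $\epsilon = 0.1$ and all function names are arbitrary choices for illustration. It locates the non-trivial fixed point by Newton iteration and evaluates the two growth rates from the linearized flow.

```python
import math
import numpy as np


def flow_constants(D, d, cutoff=1.0):
    """Constants a, B of the assumed flow equations, built from K_D and S_D."""
    S_D = 2.0 * math.pi ** (D / 2.0) / math.gamma(D / 2.0)
    K_D = S_D / (2.0 * math.pi) ** D
    return 4.0 * (d + 2) * K_D * cutoff ** D, 4.0 * (d + 8) * K_D * cutoff ** D


def beta_functions(p, eps, a, B, K=1.0, cutoff=1.0):
    """Right hand sides (d tau / dl, d u / dl) at the point p = (tau, u)."""
    tau, u = p
    s = tau + K * cutoff ** 2
    return np.array([2.0 * tau + a * u / s, eps * u - B * u ** 2 / s ** 2])


def jacobian(p, eps, a, B, K=1.0, cutoff=1.0):
    """Matrix of partial derivatives of the right hand sides."""
    tau, u = p
    s = tau + K * cutoff ** 2
    return np.array([
        [2.0 - a * u / s ** 2, a / s],
        [2.0 * B * u ** 2 / s ** 3, eps - 2.0 * B * u / s ** 2],
    ])


def nontrivial_fixed_point(eps, a, B, iterations=50):
    """Newton iteration started from the leading-order estimate (0, eps / B)."""
    p = np.array([0.0, eps / B])
    for _ in range(iterations):
        p = p - np.linalg.solve(jacobian(p, eps, a, B),
                                beta_functions(p, eps, a, B))
    return p


eps, d = 0.1, 1
a, B = flow_constants(4.0 - eps, d)
p_star = nontrivial_fixed_point(eps, a, B)
y_values = np.sort(np.linalg.eigvals(jacobian(p_star, eps, a, B)).real)
print(p_star)      # (tau*, u*): tau* is small and negative, u* is positive
print(y_values)    # one growth rate close to -eps, the other close to 2
```

Within this sketch the fixed point value of $\tau$ is itself of order $\epsilon$ rather than exactly zero, and the relevant growth rate differs from 2 by a correction of order $\epsilon$.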


6.5 Disordered systems: spin glass 


6.5.1 Spin glass: introduction 


While introducing the statistical mechanics of disordered systems it is to
be noted that the term 'order' may have different connotations in different
contexts. Thus, with reference to a crystalline solid, the term denotes a
regular geometric pattern underlying its structure. The Ising model studied
in sec. 6.2 is based on a spin-spin interaction that is characterized by a
similar regularity, and can therefore be referred to as an ordered system.
However, the underlying structural regularity need not necessarily mean that
the interaction between the spins shares the same feature. In the present
section, we will once again look at Ising type systems where, however, the
spin-spin interaction is disordered, described by one or more random
variables, in spite of a structural regularity. Put differently, the order
in the underlying structure and the order characterizing the interaction
between the constituents of a system are indicative of distinct concepts.


In a mean field Ising model, however, the structural regularity has no rele- 


vance since all spins are assumed to feel an identical environmental effect. 


Finally, the term ‘order’ stands for a thermodynamic parameter distin- 
guishing between distinct phases of a system. For instance, the order 
parameter in the case of a regular Ising model specifies the mean mag- 
netic moment at any given lattice site, distinguishing between the param- 
agnetic and ferromagnetic phases (depending on the nature of the spin- 
spin interaction, it is also possible for an Ising type model to admit of an 
anti-ferromagnetic phase). In the present section on disordered systems, 
we will again speak of an order parameter, but one that is indicative of a 
new and subtle complexity in the phase transition in consequence of the 
disordered nature of the interaction while the underlying structure may 


once again be an ordered or regular one. 


When a liquid is cooled rapidly under appropriate conditions, it gets into a
glassy state, where the randomness is positional; though the constitution
resembles that of a solid, there is no underlying crystalline structure.
The glassy state is, strictly speaking, not an equilibrium thermodynamic
state of the material under consideration since extremely slow diffusional
processes continue to take place.

In this book we will focus on disordered systems with characteristics
closely resembling those of materials in the glassy state, ones generally
referred to as spin glasses. In particular, we will consider classical spins
distributed discretely in space (analogous to the Ising system considered
in section 6.2), with interactions that vary randomly between the various
possible spin pairs. In contrast to the glassy state, the spins have
positional order but are characterized by orientational disorder of various
descriptions.


As in sec. 6.2, a spin will be assumed to be a two-state object described
by a configurational variable $s$ that can assume values, say, $s = \pm 1$
where, unlike a quantum mechanical spin, superposed states are not admitted.
More generally, one may consider Heisenberg type spins where a spin is
characterized by a vector variable s having a fixed magnitude, described
by two continuously varying configurational variables specifying its spatial
orientation. As in an Ising system, the spins are assumed to be located
at fixed positions in space. The dimension of space accommodating the
lattice may be notionally taken to be any positive integer $D$ for the sake
of generality.


Spin glasses may correspond to real-life systems, such as dilute magnetic
alloys where magnetic impurities get implanted in the lattice structure of
a non-magnetic metal (such as, for instance, manganese in copper). At
sufficiently low temperatures, the strength of magnetic interaction between
the impurity atoms turns out to be randomly distributed, with a competition
between interactions of ferromagnetic and anti-ferromagnetic types, and is
found to engender a number of notable consequences, including ones that
point to the possibility of a phase transition of a novel kind. Experimental
observations, however, do not lend themselves to clear and unambiguous
interpretations, the fundamental reason underlying the ambiguity being a
lack of clear indication as to whether the system under consideration
attains a state of true thermodynamic equilibrium or is in a dynamic
configuration with an enormously long equilibration time. Other than the
dilute magnetic alloys, a large variety of systems have now been found to
exhibit a common set of behavior patterns that has come to be identified as
typical spin-glass properties. Several of these systems are not of the
magnetic type, which means that the term 'spin' does not necessarily denote
a magnetic moment.


In the present introductory exposition we begin by considering the
equilibrium statistical mechanics of an Ising type model of spin glass
referred to as the Edwards-Anderson (EA) model. In this model the spins are
assumed to be located at the vertices of a regular lattice, and each spin is
assumed to interact with its neighbors by means of a short-range interaction
where the strength of the interaction varies randomly over relevant spin
pairs and involves a competition between interactions of ferromagnetic and
anti-ferromagnetic types. The EA model is believed to incorporate the basic
features of spin glasses so as to be sufficiently representative of the
materials studied experimentally while being, at the same time, sufficiently
simple, at least in principle. On the flip side, it resists solution in the
sense that closed expressions cannot be worked out for the thermodynamic
functions, and a phase transition from the paramagnetic to a glassy phase
cannot be established with sufficient precision. Thus, the EA model and
similar other models with short-range interactions are, at present, objects
for numerical investigations. In contrast, the mean field approach to EA
type models has proved to be of immense value in leading to a wide array of
novel results and methods in spin glass theory as also in associated areas
in the physics of phase transitions, in mathematics, and in the
inter-disciplinary science of complex systems.


The content of the present section on spin glasses is based, in the main, 
on [37], [10], and [102]; [132] is a remarkably lucid, readable, and instruc- 
tive introduction to the subject of spin glasses and their relevance to com- 


plex systems. 


In the case of a dilute magnetic alloy one finds that, typically, the mean
magnetization in the absence of a magnetic field is zero even at low
temperatures as in the case of a paramagnetic material, but the temperature
dependence of the susceptibility departs from the Curie law and involves
a cusp as shown in fig. 6-22. In this figure, the real part of the AC
susceptibility ($\chi'(\omega)$) is shown as a function of temperature at a
sufficiently low frequency such that one can compare the graph against the
static susceptibility extrapolated to low temperatures, where an enhancement
with a cusp is observed at a temperature that we denote by $T_f$ (the
'freezing' temperature, at which the spins are supposed to get frozen into
random orientations). The latter marks the onset of what appears to be
a phase transition analogous to the ferromagnetic or anti-ferromagnetic
phase transition, but one that has a number of distinctive and remarkable
features. Incidentally, experimental data on the AC susceptibility at low
frequencies, from which the static susceptibility is obtained by
extrapolation, appear to indicate that the cusp is slightly rounded.

[Figure 6-22: susceptibility $\chi$ plotted against temperature $T$, with a cusp at $T_f$]

Figure 6-22: Depicting the variation of magnetic susceptibility with
temperature ($T$) of a dilute magnetic alloy answering to the description of
a spin glass (schematic); the variation of susceptibility departs from the
Curie law for a paramagnetic substance and exhibits a cusp at a certain
temperature $T_f$ (the 'freezing' temperature) indicative of a phase
transition to a spin-glass phase which, however, is distinct from a
ferromagnetic transition; the mean magnetization $M$ in zero field is zero
both below and above $T_f$, but the transition is characterized by the
appearance of a non-zero order parameter $q$ of a novel type; more
specifically, the order parameter is a function $P(q)$ (the 'Parisi order
parameter'), symptomatic of a subtle and complex 'ergodicity breaking';
experimental data are consistent with a slight rounding of the cusp; as $T$
is lowered below $T_f$, more and more spins are frozen into fixed but random
orientations; the data relate to the real part of the AC susceptibility at
low frequencies, from which the static susceptibility is obtained by
extrapolation.


The question arises as to whether the cusp at $T = T_f$ is indicative of a
true phase transition and, if so, whether the transition is similar in
nature to a ferromagnetic or anti-ferromagnetic change of phase.
Experimental data do not discriminate clearly between a phase transition
from an equilibrium thermodynamic state above $T_f$ to one below $T_f$, and
a transition from an equilibrium state to a metastable state with an
enormously long lifetime. Observations reveal conflicting features when
looked at from a traditional perspective that offers a clear and unambiguous
characterization of a ferromagnetic or anti-ferromagnetic transition. In
this context, it becomes important to look at models that are conceptually
simple and, at the same time, incorporate features of real-life systems that
may be looked upon as ones of basic relevance in the context of the
transition under consideration.


The Edwards-Anderson model is widely believed to be one answering to 
these demands. As already mentioned, it is based on short-range inter- 
actions whose strength and polarity (in the sense of whether the inter- 
action favors the spins of a pair to be aligned parallel or anti-parallel to 
each other) vary randomly over the various interacting pairs. There is 
quite considerable evidence that these are features of basic relevance in 
real life systems that exhibit a common set of characteristics relating to 
the transition mentioned above, these being precisely the characteristics 
that allow one to classify all such systems under the single label of ‘spin 
glass’. At the same time the model is conceptually simple and clearly de- 
fined where there is little ambiguity as to how the methods of equilibrium 
statistical mechanics are to be employed in deriving the thermodynamic 


properties of a spin glass. 
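To make the ingredients of the model concrete, the sketch below sets up random nearest-neighbor couplings on a small periodic square lattice and evaluates the energy $-\sum J_{ij} s_i s_j$ of a spin configuration, with each nearest-neighbor pair counted once. The Gaussian distribution of the couplings, the lattice size, and the function names are assumptions made for illustration; the text above only requires that the strength and sign of the interaction vary randomly from pair to pair.

```python
import numpy as np


def make_couplings(L, rng):
    """Random nearest-neighbor couplings on an L x L periodic square lattice.

    J_right[i, j] couples site (i, j) to (i, j + 1);
    J_down[i, j] couples site (i, j) to (i + 1, j).
    """
    J_right = rng.normal(size=(L, L))   # assumed Gaussian, zero mean: both signs occur
    J_down = rng.normal(size=(L, L))
    return J_right, J_down


def energy(spins, J_right, J_down):
    """Energy - sum_<ij> J_ij s_i s_j, each nearest-neighbor pair counted once."""
    right = np.roll(spins, -1, axis=1)  # right[i, j] = spins[i, j + 1]
    down = np.roll(spins, -1, axis=0)   # down[i, j]  = spins[i + 1, j]
    return -np.sum(J_right * spins * right) - np.sum(J_down * spins * down)


rng = np.random.default_rng(1)
L = 6
J_right, J_down = make_couplings(L, rng)
spins = rng.choice([-1, 1], size=(L, L))   # a random Ising configuration
E = energy(spins, J_right, J_down)
print(E)
```

The energy is unchanged when all spins are flipped together, and it changes sign when all couplings are reversed; with couplings of both signs present, no single configuration can in general satisfy every bond simultaneously.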


It turns out that, in spite of its conceptual simplicity, the EA model is 
in fact a highly non-trivial one that resists an analytical derivation of the 
thermodynamic properties and of the purported transition from a disor- 
dered (‘paramagnetic’) phase to one of a frozen disorder (‘spin-glass’). On 
the other hand, a modification of the model involving infinite-range
interactions among the spins, introduced and studied by Sherrington and
Kirkpatrick (the SK model), leads to remarkable conclusions regarding the
freezing transition, though it lacks the structural order of the EA model.
Central among these is the one relating to an infinite-fold ergodicity
breaking, closely related to the breaking of replica symmetry, as it is
referred to in the literature. The breaking of replica symmetry in the SK
model is associated with the appearance of a new type of order parameter
identifying the freezing transition, namely the Parisi overlap function.


In sections 6.5.2 and 6.5.5 below, we set up the EA model of spin glass and
introduce the idea of the replica trick for the working out of the
thermodynamic functions in spin glass models. The replica trick does not yield
analytical results in the EA model but turns out to be very fruitful in the
case of the SK model. We briefly sketch a number of results relating to
the latter in sec. 6.5.7 where we will see that, above the transition
temperature $T_f$, which can be worked out explicitly in terms of the parameters
describing the model, there is a disordered phase with one single order
parameter (the ‘self-overlap’ $q$) that has a value zero. Below the transition,
on the other hand, the self-overlap is not sufficient to describe the
symmetry breaking, and one needs the overlap function to correctly describe
the transition.


6.5.2 The Edwards-Anderson model 


The EA (Edwards-Anderson) model with Ising type spins is described by
the Hamiltonian

$$H = -\frac{1}{2} \sum_{\langle ij \rangle} J_{ij}\, s_i s_j - h \sum_i s_i. \qquad \text{(6-183)}$$

In this expression, the sub-indices $i, j$ stand for locations of sites in a
lattice where, in the case of a 3D lattice, each of these corresponds to a
triplet of integers. The notation $\langle ij \rangle$ means that the summation
over lattice sites is to run over all nearest-neighbor pairs, and the factor
of $\frac{1}{2}$ accounts for the fact that each pair of neighbors is to be
included only once in the sum. The coupling between spins at locations $i, j$
is described by the coefficients $J_{ij}$, symmetrical in $i, j$
($J_{ij} = J_{ji}$), where $J_{ij}$ is assumed to represent a random variable
with some specified probability distribution (see below). Finally, the
parameter $h$ represents the effect of an external field acting uniformly on
each spin.


Comparing the notation in (6-183) with that in (6-2), one notes the use of
the variables $s_i$ in place of $\sigma_i$; the latter are the spin variables
at the various sites while the former stand for values ($s_i = \pm 1$) of these
variables. In other words, (6-183) represents the energy of any specified spin
configuration rather than the Hamiltonian; we will, however, often disregard
the distinction.


We assume the probability distribution of $J_{ij}$ to be a Gaussian one, given
by

$$P(J_{ij}) = \frac{1}{\sqrt{2\pi\Delta^2}} \exp\left(-\frac{J_{ij}^2}{2\Delta^2}\right), \qquad \text{(6-184)}$$

with mean zero and with width $\Delta$ (variance $= \Delta^2$), which we assume
to be the same for all interacting pairs.


Any possible value (within the range $-\infty$ to $+\infty$ for the Gaussian
distribution we have assumed here) of the coupling $J_{ij}$ implies a spin-spin
interaction that can be either of the ferromagnetic ($J_{ij} > 0$) or the
anti-ferromagnetic ($J_{ij} < 0$) type, where the two types correspond to a
preference for parallel (ferromagnetic) or anti-parallel (anti-ferromagnetic)
orientations of the spins at sites $i, j$.


1. While we have considered Ising type spins in (6-183), where each spin
is a two-state object (‘spin up’ and ‘spin down’), one can consider ‘spins’
of more general description (such as the Heisenberg spins mentioned
above). Ising spins have been chosen for the sake of conceptual simplicity.

2. Probability distributions for the couplings $J_{ij}$ other than Gaussian may
also be considered. For instance, a bimodal distribution with equal
probabilities for two specified values (which we denote by $\pm\Delta$) gives
results analogous to those resulting from a Gaussian distribution.
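The definition (6-183), (6-184) can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: a periodic $L \times L$ square lattice, one stored coupling per bond (so that each pair is counted once), and hypothetical helper names `make_couplings` and `ea_energy` that do not come from the text.

```python
import numpy as np

def make_couplings(L, delta, rng):
    """Gaussian couplings (mean 0, width delta) for a periodic L x L lattice.

    Jx[i, j] couples site (i, j) to site (i, (j + 1) % L);
    Jy[i, j] couples site (i, j) to site ((i + 1) % L, j).
    Storing one number per bond counts each neighbor pair once.
    """
    Jx = rng.normal(0.0, delta, size=(L, L))
    Jy = rng.normal(0.0, delta, size=(L, L))
    return Jx, Jy

def ea_energy(s, Jx, Jy, h=0.0):
    """Energy of a spin configuration s with entries +1 or -1."""
    right = np.roll(s, -1, axis=1)   # right[i, j] = s[i, (j + 1) % L]
    down = np.roll(s, -1, axis=0)    # down[i, j]  = s[(i + 1) % L, j]
    bond_sum = np.sum(Jx * s * right) + np.sum(Jy * s * down)
    return -bond_sum - h * np.sum(s)

rng = np.random.default_rng(0)
L = 6
Jx, Jy = make_couplings(L, delta=1.0, rng=rng)
s = rng.choice([-1, 1], size=(L, L))
E = ea_energy(s, Jx, Jy)
print("energy of a random configuration:", E)
```

For $h = 0$ the energy is unchanged when every spin is reversed, which is the global spin flip symmetry of the model.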


Formulas (6-183), (6-184) describe only a particular instance of the EA
model, which is more general in scope. As mentioned above, the ‘spins’
may be objects of a more general description instead of being of the
simple Ising type. The nearest neighbor interactions may be generalized
to include short range interactions with strengths decaying rapidly with
the distance between the spins. Moreover, the distribution of the coupling
strengths $J_{ij}$ may depend on the site indices $i, j$, and the mean of
$J_{ij}$ need not be zero. Keeping in mind the question of correspondence of
the model to actual physical systems, it is not necessary for the spins
to be located on all the lattice sites since one may consider models with
‘site disorder’ (where each site has some specified probability of being
occupied by a spin) as well as those with ‘bond disorder’ (eq. (6-184);
randomly distributed coupling strengths), or even ones involving both types
of disorder. Site disorder models are commonly grouped as a separate
class distinct from EA type models, though Edwards and Anderson included
site disorder in part of their considerations in the pioneering paper
(see [102]) that, in a manner of speaking, inaugurated the era of spin
glass theory.


From a more general point of view, models with long range interactions
may be considered to be extensions of the EA model, an important instance
being the SK model. In sec. 6.5.8 we briefly outline the
Thouless-Anderson-Palmer (TAP) approach to the SK model, which is essentially
an adaptation of the mean field theory to the case of randomly distributed
interactions, where the latter introduces a great deal of complexity into
the mean field approach.


In all the variants of the EA model the global spin flip symmetry holds, 
i.e., the Hamiltonian remains invariant under the change of sign of all 
the spins in the system, as is trivially the case with (6-183). The same 
symmetry holds for a regular Ising model as well and is relevant in the 
definition of the order parameter in the phase transition from a param- 
agnetic to a ferromagnetic state, which involves a spontaneous breaking 
of this symmetry. Indeed, every equilibrium state in the regular Ising 
model is associated with one obtained by the application of the global 
spin flip operation. In the paramagnetic phase, the two states are identical
(i.e., the symmetry is not broken), while in the ferromagnetic phase,
the spin-flipped states occur in pairs. Broken spin flip symmetry chooses
one of the two states of a pair, depending on the boundary condition or
the presence of a uniform magnetic field of arbitrarily small strength.


The nature of symmetry breaking in the spin glass transition is more
subtle, as is the order parameter appropriate for describing the multiple
equilibria associated with the spin-glass phase. These will be briefly
touched upon in sections 6.5.6, 6.5.7, and 6.5.8 below.


6.5.3 ‘Frustration’ 


A feature of crucial relevance in the EA model and indeed, in any spin- 
glass model, is the one referred to as frustration. Considering any partic- 
ular spin in the lattice, the signs and magnitudes of the couplings with 
its nearest neighbors may exert conflicting effects on it since some of the 
couplings may be of ferromagnetic and some others of anti-ferromagnetic 
types. Alternatively, consider a closed circuit made up of a number of
contiguous spins as in fig. 6-23. One can fix any one chosen spin as, say,
‘up’ ($s = +1$) and, moving along the circuit in any given sense (bent dotted
arrow in the fig.; we consider a 2D lattice for the sake of simplicity),
determine the spin (+1 or -1) at each site encountered in succession in
accordance with the sign of the coupling with the immediately preceding spin;
the spin encountered last in the circuit may then become indeterminate due
to conflicting requirements. This is seen in the figure, where three of the
four couplings are positive while the remaining one is negative. Starting
with the site marked ‘1’, one finds that the spins at sites marked ‘2’ and
‘3’ have to be +1 each, while that at site ‘4’ becomes indeterminate since
$J_{34}$ and $J_{14}$ exert conflicting influences on it. This, in a nutshell,
is the phenomenon of frustration, which is a basic feature of complex systems
in general.


Referring to any circuit of the type described above, the conflict between 
mutually exclusive requirements shows up if an odd number of the cou- 
plings happen to be negative. It is apparent that, in an infinitely extended 
lattice, conflicts of the type indicated above arise for a very large number 
of spins, and the thermodynamic state at a sufficiently low temperature
(where the tendency towards minimum energy dominates over that of entropic
disordering) has to depend on extremely subtle ordering effects.
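The odd-number criterion can be checked by brute force on a four-site circuit. The following is a minimal sketch, assuming unit coupling magnitudes and placing the single negative coupling on the bond between sites ‘4’ and ‘1’ (both are assumptions made only for illustration); the function names are hypothetical.

```python
from itertools import product

def is_frustrated(couplings):
    """A closed circuit is frustrated if the product of coupling signs is negative,
    i.e., if an odd number of the couplings are negative."""
    sign = 1
    for J in couplings:
        sign *= 1 if J > 0 else -1
    return sign < 0

def min_energy(couplings):
    """Brute-force minimum of E = -sum_k J_k s_k s_{k+1} around the circuit,
    where couplings[k] joins site k to site (k + 1) mod n."""
    n = len(couplings)
    best = None
    for s in product((-1, 1), repeat=n):
        E = -sum(couplings[k] * s[k] * s[(k + 1) % n] for k in range(n))
        best = E if best is None else min(best, E)
    return best

frustrated = [1.0, 1.0, 1.0, -1.0]     # J12, J23, J34 > 0 and J41 < 0
unfrustrated = [1.0, 1.0, -1.0, -1.0]  # an even number of negative couplings

for name, J in (("frustrated", frustrated), ("unfrustrated", unfrustrated)):
    print(name, is_frustrated(J), min_energy(J), -sum(abs(x) for x in J))
```

In the unfrustrated circuit the minimum energy reaches the bound $-\sum_k |J_k|$ (every bond satisfied), whereas in the frustrated one at least one bond always remains unsatisfied.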


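[Figure 6-23 (image not reproduced): a closed circuit of four sites marked ‘1’ to ‘4’ in a 2D lattice, with three positive couplings and one negative coupling.]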


Figure 6-23: Illustrating the phenomenon of frustration in a spin glass; a closed cir- 
cuit made up of sites marked ‘1’, ‘2’, ‘3’, and ‘4’ is considered (other sites are not shown) 
in a 2D lattice, with the spin at site ‘1’ chosen to be +1; the signs of the couplings be- 
tween the spins at contiguous sites are chosen as shown; moving in the sense shown by 
the bent dotted arrow one finds that the spins at sites ‘2’ and ‘3’ are required to be both 
+1 (minimization of interaction energy; we assume the temperature to be sufficiently 
low), but the spin at site ‘4’ becomes indeterminate due to mutually conflicting effects 
of spins at ‘1’ and ‘3’. 


6.5.4 Quenched disorder: the idea of self-averaging 


Considering a system described by (6-183) for any particular realization
of the set of random variables $J_{ij}$ for all the interacting spin pairs, we
have a spin glass characterized by a quenched disorder, where the term
‘quenched’ signifies that the disorder does not evolve with time and is
fixed once and for all. On the other hand, considering some other realization
of the variables $J_{ij}$ we get another system with quenched disorder,
one distinct from the former system in details, but not in its statistical
description. If now one works out any of the extensive thermodynamic
variables, e.g., the free energy (or, more precisely, the free energy per
site), for the two systems by invoking the respective canonical ensembles
at any arbitrarily chosen temperature $T = (k_B \beta)^{-1}$, will the two
sets of results agree?


Common sense tells us that they must, since the thermodynamic limit is 
supposed to describe an infinitely large system where the differences in 
detail for the two systems referred to are not expected to matter. In other 
words, what matters in the thermodynamic limit is the statistical distribution
of the couplings $J_{ij}$ and not their actual values for specific choices
of spin pairs, since possible variations in these actual values are likely to
get averaged out between large chunks of an infinitely large system. Put
differently, large chunks of a very large system can themselves be looked
upon as different realizations of the distribution of the $J_{ij}$’s, and
provide for the necessary averaging in the thermodynamic limit. In this sense,
thermodynamic variables are expected to be self-averaging.
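The idea of self-averaging can be illustrated numerically on a system whose free energy is known in closed form. The sketch below assumes an open Ising chain with Gaussian bond couplings and $h = 0$, for which $Z = 2 \prod_k 2\cosh(\beta J_k)$. This chain is chosen purely for convenience (it has neither frustration nor a phase transition), and the function names are hypothetical.

```python
import numpy as np

def free_energy_per_site(J, beta):
    """Exact free energy per site of an open Ising chain with bonds J and h = 0.

    The chain has N = len(J) + 1 spins and Z = 2 * prod_k 2 cosh(beta * J_k).
    """
    N = len(J) + 1
    # np.logaddexp(x, -x) equals ln(2 cosh x), evaluated without overflow
    log_Z = np.log(2.0) + np.sum(np.logaddexp(beta * J, -beta * J))
    return -log_Z / (beta * N)

def sample_spread(N, beta, delta, n_samples, rng):
    """Standard deviation of f over independent realizations of the disorder."""
    values = [free_energy_per_site(rng.normal(0.0, delta, N - 1), beta)
              for _ in range(n_samples)]
    return np.std(values)

rng = np.random.default_rng(1)
for N in (10, 100, 1000, 10000):
    spread = sample_spread(N, beta=1.0, delta=1.0, n_samples=200, rng=rng)
    print(N, spread)
```

The sample-to-sample spread of the free energy per site shrinks as the chain grows (roughly as $1/\sqrt{N}$ here, since the bonds contribute independently), so that a single large sample becomes representative of the disorder average.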


The above qualitative argument leading to the idea of self-averaging can be 
made rigorous, provided that the random interaction between the spins is a
short-range one - a requirement satisfied in the EA model which assumes
a nearest-neighbor interaction. In the case of a long-range random
interaction, one can establish that self-averaging still holds provided that
the strength of interaction between any two spins falls off at least like the
inverse separation between them raised to the power $D/2$, where $D$ stands for
the dimension of the space in which the lattice is imagined to exist ($D = 3$
for ordinary space); for a proof see [142].


The Sherrington-Kirkpatrick model discussed in sec. 6.5.7 below is an infinite- 
range one where the interaction strength between any two spins is indepen- 
dent of their separation, but its variance is scaled down by the system size 
($N$). This, once again, implies that the free energy is a self-averaging
quantity in this model. However, certain physical quantities remain
non-self-averaging, which is actually indicative of a subtle and essential
distinction between regular and random systems.


In working out a thermodynamic quantity such as the free energy density
$f = \lim_{N \to \infty} \frac{F_N}{N}$, where $F_N$ denotes the free energy
for a system with $N$ spins, one has to allow for the fact that $F_N$ will, in
general, depend on the particular realization of the disorder pertaining to
the system. Denoting a realization by the symbol $\mathcal{J}$, one is led to
consider the quantity $F_N(\mathcal{J})$ depending on $\mathcal{J}$, and is
then required to average over all possible $\mathcal{J}$’s so as to obtain

$$F_N = [F_N(\mathcal{J})]_{\rm av} = \overline{F_N(\mathcal{J})}, \qquad \text{(6-185)}$$

where the notation $[\,\cdot\,]_{\rm av}$ or, equivalently, the overline, is
used to denote an averaging over various possible realizations of the
disorder. The thermodynamic free energy density is then obtained by going over
to the limit $N \to \infty$, relying implicitly on the validity of the
principle of self-averaging. Here we have referred to the free energy since it
is the thermodynamic variable based on which other thermodynamic variables can
be worked out. The free energy, in turn, is obtained from the partition
function as


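$$F_N(\mathcal{J}) = -\beta^{-1} \ln Z_N(\mathcal{J}), \qquad \text{(6-186a)}$$

where $Z_N$ involves a sum over spin configurations

$$Z_N(\mathcal{J}) = \mathrm{Tr}_{\{S\}} \exp\left(-\beta H_N(S, \mathcal{J})\right). \qquad \text{(6-186b)}$$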


In this expression, $H_N(S, \mathcal{J})$ stands for the value assumed by the
Hamiltonian (6-183) for a spin configuration $S$ and a realization
$\mathcal{J}$ of the disorder, given that the system size is $N$. The notation
$\mathrm{Tr}_{\{S\}}$ denotes a sum over all possible spin configurations $S$.
The formula (6-185) then requires that one work out the average of
$\ln Z_N(\mathcal{J})$ over the disorder $\mathcal{J}$, which is extremely
difficult to accomplish. Instead of calculating the average of the logarithm,
it is relatively easy, at least in principle, to work out the average of
$Z_N(\mathcal{J})$ or of any positive integer power of it. This leads to the
idea of the replica trick, as outlined in sec. 6.5.5 below.
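The difference between averaging $\ln Z$ and averaging $Z$ itself can be seen on the smallest possible example. The sketch below assumes a single pair of Ising spins with $H = -J s_1 s_2$, so that $Z(J) = 4\cosh(\beta J)$, and a Gaussian coupling of width $\Delta$; the averages are evaluated by Gauss-Hermite quadrature. The system and the helper names are illustrative assumptions, not part of the text.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_average(func, delta, deg=60):
    """Average of func(J) over J ~ N(0, delta^2) by Gauss-Hermite quadrature."""
    x, w = hermegauss(deg)            # nodes and weights for the weight exp(-x^2 / 2)
    return np.sum(w * func(delta * x)) / np.sqrt(2.0 * np.pi)

def log_Z(J, beta):
    """ln Z for two spins coupled by J: Z = 4 cosh(beta J)."""
    return np.log(2.0) + np.logaddexp(beta * J, -beta * J)

beta, delta = 1.0, 1.0
# average of the logarithm (what (6-185) calls for)
avg_log_Z = gaussian_average(lambda J: log_Z(J, beta), delta)
# logarithm of the average (the easier quantity)
log_avg_Z = np.log(gaussian_average(lambda J: np.exp(log_Z(J, beta)), delta))
print("average of ln Z :", avg_log_Z)
print("ln of average Z :", log_avg_Z)
```

The average of $Z$ has the closed form $4 e^{\beta^2 \Delta^2 / 2}$, while the average of the logarithm has no comparably simple expression; the latter is always the smaller of the two, by the concavity of the logarithm.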


In this context, it is important to look at the physical basis of assuming 
that the disorder is of the quenched type. For a large class of disor- 
dered systems - ones that make up the category referred to as spin glass 
- it is found that results arrived at under the assumption of quenched 
disorder conform well to experimental observations. Generally speaking, 
the disorder is expected to evolve in time since, starting with the system
in a non-equilibrium configuration, transport processes of various
descriptions are likely to occur, leading asymptotically to an equilibrium
configuration. The idea of a quenched disorder can be meaningful only
when the time scale of evolution of the disorder is large compared to the
time of observation over which the equilibrium is arrived at, i.e., when the
equilibrium is established with what can be considered to be a frozen-in
disorder. In this case the configuration sum in the partition function can be
worked out independently of the averaging over the disorder, which can be
performed at a subsequent stage.


In contrast, the disorder in a physical system can also be of the annealed 
type where the time scale of the evolution of the disorder is comparable 
to the time of observation over which equilibration sets in, in which case
the segregation of the two averaging processes (one over configurations
and the other over realizations of the disorder) is no longer valid. We will
leave this type of disorder out of consideration and, instead, will focus on
quenched disorder alone.


6.5.5 The replica trick 


The replica trick is based on the identity 


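$$\ln z = \lim_{n \to 0} \frac{z^n - 1}{n}. \qquad \text{(6-187)}$$

Recall that, for sufficiently small $n$, $z^n = e^{n \ln z} \approx 1 + n \ln z$.

This gives, for finite $N$, and any realization of the disorder $\mathcal{J}$,

$$\ln Z_N(\mathcal{J}) = \lim_{n \to 0} \frac{1}{n}\left(Z_N^n(\mathcal{J}) - 1\right). \qquad \text{(6-188a)}$$

Averaging over the disorder, one obtains

$$\overline{\ln Z_N} = \lim_{n \to 0} \frac{1}{n}\left(\overline{Z_N^n} - 1\right). \qquad \text{(6-188b)}$$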
For any positive integral value of $n$, it is, in principle, easier to work out
$\overline{Z_N^n}$ as compared to a direct evaluation of $\overline{\ln Z_N}$.
One considers $n$ copies, or replicas, of the system, all with the same
realization ($\mathcal{J}$) of the disorder and the same number of particles
($N$), and then evaluates the partition function of the composite system made
up of the $n$ replicas, where the replicas are considered as independent
(i.e., non-interacting) subsystems, thereby obtaining $Z_N^n(\mathcal{J})$.
Averaging over the disorder (see sec. 6.5.7 for a concrete example) one
arrives at $\overline{Z_N^n}$ for any arbitrarily chosen positive integer
value of $n$.
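The limit in (6-188b) can be checked numerically in a case where the disorder average is available for real $n$ without any analytic continuation. The sketch below again assumes a single pair of spins with $Z(J) = 4\cosh(\beta J)$ and a Gaussian coupling, handled by Gauss-Hermite quadrature; the names `avg_Zn` and `replica_estimate` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

x, w = hermegauss(60)                 # quadrature for the weight exp(-x^2 / 2)
w = w / np.sqrt(2.0 * np.pi)          # normalize to a unit Gaussian measure

beta, delta = 1.0, 1.0
# ln Z at each quadrature node, with J = delta * x and Z = 4 cosh(beta J)
log_Z = np.log(2.0) + np.logaddexp(beta * delta * x, -beta * delta * x)

def avg_Zn(n):
    """Disorder average of Z^n, for real n."""
    return np.sum(w * np.exp(n * log_Z))

def replica_estimate(n):
    """Finite-n expression (avg(Z^n) - 1) / n appearing in (6-188b)."""
    return (avg_Zn(n) - 1.0) / n

direct = np.sum(w * log_Z)            # disorder average of ln Z
for n in (1.0, 0.1, 0.01, 0.001, 0.0001):
    print(n, replica_estimate(n), direct)
```

As $n$ decreases, the finite-$n$ expression approaches the directly computed average of $\ln Z$, with a deviation that is of first order in $n$.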


In evaluating $Z_N^n(\mathcal{J})$ of the composite system made up of the $n$
replicas, one has to sum over all possible spin configurations of each of
the replicas, where the spin configurations contribute independently to
the sum. On performing the averaging over the disorder, one finds that
the problem gets converted to the evaluation of the partition function
of a system of $n$ interacting spin systems, but now without
disorder, with some effective Hamiltonian $H_{\rm eff}$, the explicit form of
which can be worked out from the theory.

With $Z_N(\mathcal{J})$ given by (6-186b), one has

$$Z_N^n(\mathcal{J}) = \mathrm{Tr}_{\{S^{(1)}\}, \ldots, \{S^{(n)}\}} \exp\left[-\beta \sum_{\alpha=1}^{n} H_N(S^{(\alpha)}, \mathcal{J})\right], \qquad \text{(6-189)}$$

where the various different replicas are distinguished by super-indices 1
to $n$, and $S^{(\alpha)}$ stands for a spin configuration of the replica with
index $\alpha$ ($\alpha = 1, 2, \ldots, n$). The configuration sum
$\mathrm{Tr}_{\{S^{(1)}\}, \ldots, \{S^{(n)}\}}$ is now over the spin
configurations of all the $n$ replicas, which we will write in brief as
$\mathrm{Tr}_{\{S^{(\alpha)}\}}$. In the case of a Hamiltonian of the form
(6-183), one can first express $\overline{Z_N^n(\mathcal{J})}$ as
$\exp \ln \overline{Z_N^n(\mathcal{J})}$ and then invoke the cumulant
expansion so as to obtain $\overline{Z_N^n(\mathcal{J})}$ in the form of the
exponential of a series expansion in terms of the cumulants of the random
variables $J_{ij}$.


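In the case of a disordered system with an Ising EA type Hamiltonian (6-183),
one obtains (see [10])

$$\overline{Z_N^n} = \mathrm{Tr}_{\{S^{(\alpha)}\}} \exp\left(-\beta H_{\rm eff}\right), \qquad -\beta H_{\rm eff} = \frac{1}{2} \sum_{i \neq j} \sum_{l=1}^{\infty} \frac{\beta^l}{l!}\, J_{ij}^{[l]} \left(\sum_{\alpha=1}^{n} s_i^{(\alpha)} s_j^{(\alpha)}\right)^{l}, \qquad \text{(6-190)}$$

where we have assumed the external field to be absent ($h = 0$) for the sake of
simplicity and denoted the $l$th cumulant ($l = 1, 2, \ldots$) of the
distribution for $J_{ij}$ by $J_{ij}^{[l]}$. In this expression, the
interaction between the spins need not be confined to nearest neighbors. One
observes that the effective Hamiltonian involves couplings between the
different replicas.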
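The way in which the disorder average couples the replicas can be verified on a single bond. For a Gaussian coupling of zero mean and variance $\Delta^2$, the average of $\exp(\beta J X)$ is $\exp(\beta^2 \Delta^2 X^2 / 2)$, with $X = \sum_\alpha s_1^{(\alpha)} s_2^{(\alpha)}$. The sketch below assumes two spins with $H = -J s_1 s_2$ and compares the two routes to the average of $Z^n$; the function names are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def avg_Zn_direct(n, beta, delta, deg=60):
    """Average Z(J)^n over the Gaussian coupling, with Z(J) = 4 cosh(beta J)."""
    x, w = hermegauss(deg)            # quadrature for the weight exp(-x^2 / 2)
    Z = 4.0 * np.cosh(beta * delta * x)
    return np.sum(w * Z ** n) / np.sqrt(2.0 * np.pi)

def avg_Zn_replica(n, beta, delta):
    """Trace over n replicas of the two spins, with the disorder already averaged.

    Each replicated configuration contributes exp(beta^2 delta^2 X^2 / 2),
    where X = sum_alpha s1^(alpha) * s2^(alpha) couples the replicas.
    """
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=2 * n):
        s1, s2 = spins[:n], spins[n:]
        X = sum(a * b for a, b in zip(s1, s2))
        total += np.exp(0.5 * (beta * delta * X) ** 2)
    return total

beta, delta = 0.7, 1.0
for n in (1, 2, 3):
    print(n, avg_Zn_direct(n, beta, delta), avg_Zn_replica(n, beta, delta))
```

The two numbers agree for each positive integer $n$: the disorder has disappeared from the second expression at the price of the term in $X^2$, which makes distinct replicas interact.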


The special feature of a Gaussian distribution is that only the first two cu- 
mulants are non-zero, while the higher cumulants are all zero. Moreover, 
for a Gaussian distribution with zero mean, the only non-zero cumulant is 
the one of order two, being equal to the second moment of the distribution. 
In other words, the cumulant expansion greatly simplifies the calculation,
in the replica method, of the average $\overline{Z_N^n}$ in the case of a
Hamiltonian of the form (6-183), with a Gaussian distribution of the
$J_{ij}$’s.


Once the average $\overline{Z_N^n}$ is calculated as the partition function
pertaining to $H_{\rm eff}$, it is then to be continued analytically to small
real values of $n$ so as to evaluate the limit $n \to 0$, after which one has
to finally go over to the thermodynamic limit ($N \to \infty$):

$$\lim_{N \to \infty} \frac{\overline{\ln Z_N}}{N} = \lim_{N \to \infty} \frac{1}{N} \lim_{n \to 0} \frac{1}{n}\left(\overline{Z_N^n} - 1\right). \qquad \text{(6-191)}$$

In practice, it is advantageous to take the limit $N \to \infty$ before the
limit $n \to 0$ by invoking an asymptotic method such as the method of
steepest descent. While this interchange of limits is questionable in
principle, there is ample indication, based on a number of investigations,
that it does not lead to erroneous conclusions (refer to [10]).


The limit $n \to 0$ introduces a subtle complexity in the entire theory, and
brings in the question of replica symmetry breaking mentioned earlier.


6.5.6 Spin-glass phase transition and order parameter 


The random variation of the coupling strengths between interacting spin pairs
in eq. (6-183), with positive and negative coupling strengths corresponding
to ferromagnetic and anti-ferromagnetic interactions (favoring,
respectively, parallel and anti-parallel spins), is a feature of central
relevance in the EA model and other interesting spin glass models. It is
this feature that is responsible for ‘frustration’ in the system: individual
spins feel conflicting effects due to a competition between the two types
of interactions.

The physical origin of the competition between interactions of ferromag- 
netic and anti-ferromagnetic types may be diverse for various different 
spin glass materials. One common explanation of this phenomenon is 
based on what is referred to as the RKKY (Ruderman-Kittel-Kasuya-Yosida) 
effect. When conduction electrons are scattered by magnetic moments located
at lattice sites, there appear spatially localized and alternating regions
of oppositely oriented spin polarization by interference effects, and
the interaction between two magnetic moments depends on which zones
the two corresponding sites belong to.


As mentioned earlier, the EA model is designed to capture the essential
aspect of a random competition between interactions of opposite types
between ‘spin’ pairs located at lattice sites. With the simplest Ising type
version of the model given in (6-183) and (6-184), one can carry out the
replica calculation outlined in sec. 6.5.5 and obtain (assuming the external
field parameter $h$ to be zero for the sake of simplicity)

$$\beta H_{\rm eff} = -\frac{\beta^2 \Delta^2}{4} \sum_{i,j} \sum_{\alpha,\gamma} s_i^{(\alpha)} s_j^{(\alpha)} s_i^{(\gamma)} s_j^{(\gamma)} \qquad \text{(6-192)}$$

(refer to (6-190) and to [10]). Here $\alpha, \gamma$ run from 1 to $n$ and
$i, j$ run from 1 to $N$ with the constraint that they correspond to nearest
neighbor sites. However, as mentioned earlier, the replica method works in
more general contexts as well, and one can consider interactions of arbitrary
range (in particular, the SK model outlined in sec. 6.5.7 involves an
interaction of infinite range), and ones with the mean ($\overline{J_{ij}}$)
and variance ($\overline{J_{ij}^2} - \overline{J_{ij}}^{\,2}$) depending on
$i, j$, and also with $h \neq 0$.


One observes in (6-192) that, while the disorder has been averaged out in
arriving at the effective Hamiltonian $H_{\rm eff}$ (which is now
translationally invariant), the latter involves an interaction between pairs
of replicas, which constitutes the price paid for getting rid of the disorder.


Edwards and Anderson did not restrict their considerations to nearest 
neighbor interactions alone, and looked for general features pertaining 
to a possible spin-glass phase transition (the ‘freezing transition’ referred 
to above) in which each spin gets frozen in some fixed orientation, with 
the latter varying randomly from one spin to another. An equilibrium 
phase with spins frozen in randomly varying orientations is one with very 
special characteristic features compared to a ferromagnetic phase where 
the mean magnetic moment per site (m) acts as the order parameter. 
Because of translational invariance of the regular Ising Hamiltonian, one
has, for any specified site $i$ and for sufficiently large size $N$,

$$m = \frac{1}{N} \sum_j \langle s_j \rangle = \langle s_i \rangle. \qquad \text{(6-193)}$$


In the case of a pure phase, m is a thermodynamic variable having a 
well defined value at any specified temperature below the transition. The 
global spin flip symmetry of the Hamiltonian implies that the pure phases 
occur in spin-flipped pairs with order parameters $\pm m$, and the occurrence
of either member of a pair in any particular sample constitutes an
instance of broken symmetry. In the case of a random interaction, on the
other hand, there is no translational invariance, though the global spin
flip symmetry continues to be there. Thus, $\langle s_i \rangle \neq 0$ for
almost every site considered without reference to other sites, but varies
randomly from site to site, there being no long range order as in the case of
a ferromagnet; in other words, $m = \frac{1}{N} \sum_i \langle s_i \rangle$ is
expected to be zero in the thermodynamic limit. A more appropriate description
of the spin-glass phase would be in terms of the Edwards-Anderson order
parameter $q_{EA}$ (commonly abbreviated to $q$)


$$q_{EA} = \lim_{N \to \infty} \frac{1}{N} \sum_i \langle s_i \rangle^2, \qquad \text{(6-194)}$$

where we implicitly assume self-averaging to hold. One expects $q_{EA}$ to be
zero in the paramagnetic phase (i.e., above the transition temperature;
this has been referred to above as the freezing temperature $T_f$) and to
be non-zero in the spin glass phase where spins are frozen in random
orientations. Once again, the breaking of global spin flip symmetry is
expected to lead to a pair of pure spin glass states (or, more properly,
pure phases), while mixed states, occurring once again in pairs, are also
possible as in the regular Ising model with $D \geq 2$.
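The contrast between $m$ and $q_{EA}$ can be illustrated with synthetic data. The sketch below does not compute thermal averages from any Hamiltonian: it simply assumes local moments $\langle s_i \rangle$ of a hypothetical fixed magnitude `m0` and random sign, as a caricature of spins frozen in random orientations.

```python
import numpy as np

def magnetization(local_m):
    """m = (1/N) sum_i <s_i>."""
    return np.mean(local_m)

def ea_order_parameter(local_m):
    """q_EA = (1/N) sum_i <s_i>^2 for a finite sample."""
    return np.mean(local_m ** 2)

rng = np.random.default_rng(3)
N = 100_000
m0 = 0.8   # hypothetical magnitude of the frozen local moment

frozen = m0 * rng.choice([-1.0, 1.0], size=N)   # frozen in random orientations
paramagnet = np.zeros(N)                        # <s_i> = 0 at every site

print("frozen:     m =", magnetization(frozen), " q =", ea_order_parameter(frozen))
print("paramagnet: m =", magnetization(paramagnet), " q =", ea_order_parameter(paramagnet))
```

The magnetization of the frozen sample is of order $1/\sqrt{N}$ and vanishes for large $N$, while $q_{EA}$ stays at `m0**2`; both quantities vanish for the paramagnet.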


Numerical and theoretical results on the nearest neighbor EA model do 
indeed indicate the existence of a phase transition (depending, however, 
on the dimension D of the space in which the lattice of spins is imagined 
to exist), but the structure of pure states below the transition remains 
elusive. In particular, one encounters the possibility of a large number of 
pure states that cannot be described in terms of a single order parameter
such as $q_{EA}$.




Phase transitions appear only in infinitely large systems, for which the 
question of characterization of equilibrium states is a non-trivial one (for 
instance, the Hamiltonian of an infinitely large system is not well de- 
fined), and has been briefly addressed in sections 5.6, 6.2.5. An equilib- 
rium state is one that satisfies the DLR condition, with reference to which 
one can conveniently characterize a phase transition in an infinitely large 
system. The transition occurs at a certain temperature (which we continue
to denote by $T_f$) such that, for $T > T_f$, there is only one single state
satisfying the DLR condition while, below $T_f$, the DLR condition is satisfied
by more than one state, among which there exist a number of extremal
states, also referred to as pure states (once again, the term pure ‘phases’
is more appropriate). For instance, in the case of the ferromagnetic
transition, there are precisely two extremal states related to each other by
the global spin flip symmetry. All other states satisfying the DLR condition
are referred to as mixed states, and are convex linear combinations of the
extremal states with weights that vary from one mixed state to another.
In the case of a spin glass, all pure and mixed states below the transition
occur in pairs in virtue of the (broken) spin flip symmetry.


As indicated parenthetically in the above paragraph, the terms pure and 
mixed states used above are not to be confused with the ones referring to 
pure and mixed states in the phase space of a system (refer to section 1.3; 
we consider here the classical description of a system) where a pure state 
corresponds to a single point in the phase space, and a mixed state to a 
probability distribution. In order to distinguish between the two usages, we
employ the terms pure and mixed phases in describing the scenario relating
to a phase transition: while there is a unique (pure) phase above the
transition temperature, there arise pure and mixed phases below the
transition, a mixed phase being a convex linear combination of pure phases.
All the pure and mixed phases are, generally speaking, mixed states,
corresponding to probability distributions in the phase space. The term state
is employed to denote a pure or mixed phase provided that the context
precludes the possibility of confusion.


Since there is only one single phase above the transition (for which the
spin-flip gives back the same state), it involves a distribution over all pure
states in the phase space, and the system is ergodic.


In this context, recall that ergodicity is a property that is to be understood
with reference to the specification of the state of the system in terms of its
energy ($U$) and size (the number of lattice sites, $N$, in the case of a spin
system) and, moreover, refers to its dynamics. While specifying the
Hamiltonian (or a class of Hamiltonians) for a spin glass we have, till now,
made no reference to its dynamics. One can, for instance, refer to the Glauber
dynamics, which involves the flipping of one spin at a time. Ergodicity
implies that the equilibrium phase corresponds to an invariant distribution
over the entire phase space.
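A single-spin-flip dynamics of the Glauber type can be sketched in a few lines. The code below assumes a periodic $L \times L$ lattice with one Gaussian coupling per bond and $h = 0$, and uses the acceptance probability $1/(1 + e^{\beta \Delta E})$ for a flip that changes the energy by $\Delta E$. The lattice layout and the function names are assumptions made for illustration.

```python
import numpy as np

def energy(s, Jx, Jy):
    """E = -sum over bonds of J s s', with Jx for horizontal and Jy for vertical bonds."""
    return -(np.sum(Jx * s * np.roll(s, -1, axis=1))
             + np.sum(Jy * s * np.roll(s, -1, axis=0)))

def glauber_sweep(s, Jx, Jy, beta, rng):
    """Attempt L*L single-spin flips; return the energy change of the sweep."""
    L = s.shape[0]
    dE_total = 0.0
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        # local field at (i, j) from its four neighbors
        field = (Jx[i, j] * s[i, (j + 1) % L]
                 + Jx[i, (j - 1) % L] * s[i, (j - 1) % L]
                 + Jy[i, j] * s[(i + 1) % L, j]
                 + Jy[(i - 1) % L, j] * s[(i - 1) % L, j])
        dE = 2.0 * s[i, j] * field          # energy change if s[i, j] is flipped
        if rng.random() < 1.0 / (1.0 + np.exp(beta * dE)):
            s[i, j] = -s[i, j]
            dE_total += dE
    return dE_total

rng = np.random.default_rng(5)
L, beta = 8, 1.0
Jx = rng.normal(size=(L, L))
Jy = rng.normal(size=(L, L))
s = rng.choice([-1, 1], size=(L, L))
E0 = energy(s, Jx, Jy)
dE = sum(glauber_sweep(s, Jx, Jy, beta, rng) for _ in range(50))
print("initial energy:", E0, " final energy:", E0 + dE)
```

Since each step touches only one spin, the energy can be tracked incrementally through the local field; for large $\beta$ the moves that raise the energy become rare, which is the mechanism behind trapping at low temperatures.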


Below the transition, on the other hand, the existence of numerous pure 
phases is indicative of broken ergodicity, where each pure phase corre- 
sponds to an invariant distribution over only a part of the phase space. 


In the case of a ferromagnet, for instance, there are two pure phases
corresponding to disjoint parts of the phase space related to each other
by the spin flip symmetry, and broken ergodicity is thus associated with
broken symmetry. The order parameter distinguishes between the two
pure phases.


Numerical and theoretical investigations all point to the remarkable feature
of the spin glass transition that there occurs a large number of pure
phases, possibly infinite, as one crosses $T_f$ from above. The phase space
breaks up into a large number of ergodic components, each of which
accommodates one single pure phase. However, this entire scenario of the
phase space breaking up into components refers to an infinitely large
system. For a large but finite system, the microscopic dynamics decrees
that the phase space is made up of regions that act effectively as traps
for the system: once it gets caught in a trap, the system takes a very
long time to come out of it, whereupon it is arrested in another trap. This is
conveniently expressed by saying that the phase space is made up of ‘valleys’
representing minima of the free energy (a functional of the phase
space co-ordinates) separated by ‘barriers’. Among the valleys, some
represent local minima of the free energy while some others represent global
minima.
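A rough feeling for such a landscape can be had by enumeration on a very small sample. The sketch below counts configurations whose energy cannot be lowered by flipping any single spin. This is a zero-temperature stand-in for the valleys of the free energy, not the free energy landscape itself; the all-to-all Gaussian couplings with variance $1/N$ are an assumption chosen only to keep the code short.

```python
import itertools
import numpy as np

def count_one_flip_stable(J):
    """Count configurations that are stable against every single spin flip.

    J is a symmetric N x N coupling matrix with zero diagonal and
    E(s) = -(1/2) sum_ij J_ij s_i s_j.
    """
    N = J.shape[0]
    count = 0
    for spins in itertools.product((-1, 1), repeat=N):
        s = np.array(spins)
        local_field = J @ s
        # flipping spin i changes E by 2 s_i (J s)_i; stable if every change is positive
        if np.all(s * local_field > 0):
            count += 1
    return count

rng = np.random.default_rng(4)
N = 10
A = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
J = np.triu(A, 1)
J = J + J.T                      # symmetric couplings, zero diagonal
n_stable = count_one_flip_stable(J)
print("one-flip-stable configurations:", n_stable, "out of", 2 ** N)
```

The count is always even, since the stable configurations occur in spin-flipped pairs; it is at least two, because the pair of lowest-energy configurations is necessarily stable.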


Such a description is appropriate for a specification of states in terms of
temperature and system size rather than the energy and size (recall that
the latter specification remains valid for a finite system while the former
requires, strictly speaking, an infinite one).


As the system size ($N$) is made to go to infinity, the free energy
‘landscape’ gets modified, with the barriers between the global minima
becoming infinitely high and the total number of local and global minima
proliferating to infinity exponentially in $N$ (the number of global minima,
however, does not increase exponentially). The global minima now correspond
to the ergodic components of the entire phase space, disjoint from
one another. These ergodic components occur in spin-flipped pairs, but
the various different pairs are not related to one another by any well
defined symmetry operation. Put differently, the transformation from one
pair of ergodic components to another involves local operations on spins
rather than a global one, where these local operations are not known
beforehand. A closely related fact is that the various pure phases cannot
be distinguished by a single order parameter such as $q_{EA}$. As we will see
in connection with the SK model, the order parameter identifying the ergodic
components is of a more complex nature (represented by the Parisi
overlap function), which can be understood in terms of what is referred to
as replica symmetry breaking.


The statements made in the above paragraphs are admittedly of a quali- 
tative nature, and have been included so as to present, in simple terms, 
a number of basic facts relating to the spin glass phase transition. More 
technical and complete presentations are to be found in [10], [37], and 
in [32]. The features mentioned here will be taken up again in secs. 6.5.8
and 6.5.9.


These aspects of the spin-glass transition are mostly confirmed in
numerical investigations on the nearest-neighbor EA model and similar
models based on short range interactions. The occurrence of a phase
transition, with a complete description of the spin glass phase, has not
been established rigorously for these short-range models. A phase
transition in three dimensions ($D = 3$) appears probable, and one in
$D = 2$ seems less likely (a phase transition in $D = 1$ with short range
interactions is ruled out anyway) while, on the other hand, there does
occur a phase transition in infinite dimensions. This paucity of
rigorously established facts on short-range models is amply made up for
as one looks at the SK model of spin glass (sec. 6.5.7), which is based
on infinite-range interactions and which elucidates to a considerable
extent the nature of the order parameter characterizing the spin glass
phase. Each spin in the SK model is coupled to every other spin, in
consequence of which the geometrical structure of the underlying lattice
loses relevance. In this sense, the SK model emulates the spirit of mean
field models. As stated earlier, the mean field approach to spin glass is
beset with non-trivial problems because of the random interactions
between the spins, though an adaptation by Thouless, Anderson and Palmer
(sec. 6.5.8) has met with considerable success.


6.5.7 The Sherrington-Kirkpatrick model 


The Sherrington-Kirkpatrick model (SK model in brief) is described by the
Hamiltonian

$$H = -\frac{1}{2}\sum_{i \neq j} J_{ij}\, s_i s_j - h\sum_i s_i, \tag{6-195}$$

which is formally the same as (6-183), but where the summation over the
indices $i, j$ now runs over all spin pairs (involving spins at different
sites) in the lattice. Since each spin interacts with all other spins, the
lattice structure loses relevance, and the site indices can be assumed to
run through integer values, allowing one to order them (for instance, the
summation in (6-195) can be replaced with $\sum_{i<j}$, dropping the
factor of $\frac{1}{2}$). This places all spins formally on an equal
footing, though the random coupling strengths vary from one spin pair to
another, which makes the problem non-trivial, though tractable. The
replica trick can be invoked so as to obtain a reasonably complete picture
of the spin glass transition in the model, of which a brief outline is
included in the present section (details of derivations are to be found
in [10] and [37]).
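As a concrete illustration of the Hamiltonian (6-195), the following minimal sketch evaluates the energy of a single spin configuration using the ordered-pair form of the sum (over $i<j$, without the factor of one half). The function name `sk_energy`, the three-spin coupling table and the sign convention (a positive $J_{ij}$ favouring aligned spins) are assumptions made for this example only.

```python
def sk_energy(spins, couplings, h=0.0):
    """Return H = -sum_{i<j} J_ij s_i s_j - h sum_i s_i for one configuration."""
    n = len(spins)
    # Each pair is counted once, which is why no factor of 1/2 appears.
    pair_term = sum(
        couplings[i][j] * spins[i] * spins[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return -pair_term - h * sum(spins)


# Hypothetical three-spin example; only the upper triangle (i < j) is read.
J = [
    [0.0, 1.0, -0.5],
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
]
spins = [1, -1, 1]
energy = sk_energy(spins, J, h=0.1)

# With h = 0 the energy is unchanged when every spin is flipped.
unflipped_energy = sk_energy(spins, J, h=0.0)
flipped_energy = sk_energy([-s for s in spins], J, h=0.0)
```

The invariance under a global spin flip at $h = 0$ is the symmetry behind the spin-flipped pairs of ergodic components mentioned earlier.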


The definition of the SK model is completed by specifying the probability
distribution of the coupling strengths $J_{ij}$. All the $J_{ij}$'s are
assumed to be distributed identically, with a probability density given by
a Gaussian:

$$P(J_{ij}) = \frac{1}{\sqrt{2\pi\Delta^2}}\exp\left[-\frac{(J_{ij} - \bar{J})^2}{2\Delta^2}\right], \tag{6-196a}$$

where $\bar{J}$ and $\Delta^2$ are the mean and variance of the
distribution. In applying the replica method, one starts with a finite
system size ($N$) and averages over the disorder, finally going over to
the thermodynamic limit $N \to \infty$. The idea of self-averaging
requires that this should give the correct scaling of the extensive
quantities such as the free energy with the system size. However, as
mentioned earlier, self-averaging holds at most for interactions falling
off with distance at a certain minimum rate. Since the couplings in
(6-195) are independent of the separation between the spins, one has to
assume that they scale appropriately with the system size so as to lead
to self-averaging. With this in mind, we assume the following scaling for
the mean and variance (the only two relevant parameters characterizing a
Gaussian distribution) of $P(J_{ij})$:

$$\bar{J} = \frac{J_0}{N}, \qquad \Delta^2 = \frac{J^2}{N}. \tag{6-196b}$$

It may be mentioned, however, that the results of averaging over the
disorder are not sensitive to the form of the probability distribution,
provided that the third and higher moments of the distribution remain
appropriately bounded.


A non-zero value of the mean ($\bar{J}$) introduces a systematic bias
towards a ferromagnetic (or anti-ferromagnetic; in the following we use
the term ‘ferromagnetic’ in an inclusive sense so as to cover both types
of interaction) coupling in the spin-spin interaction, and a pure
spin-glass type interaction corresponds to $J_0 = 0$, while $J$
($= \sqrt{N}\Delta$) represents the strength of the spin-glass
interaction. Together, the three parameters $J_0$, $J$ and $h$ constitute
a complete characterization of the SK model. As we will see, the
spin-glass parameter $J$ imparts novel features to the theory as compared
with those resulting from a ferromagnetic interaction.
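The scaling of the coupling distribution with system size can be made concrete with a short sampling sketch. It assumes the standard SK scaling, mean $J_0/N$ and variance $J^2/N$, for the Gaussian couplings; the helper name `sample_couplings`, the seed and the parameter values are illustrative choices, not taken from the text.

```python
import math
import random


def sample_couplings(n_spins, j0, j, rng):
    """Draw a symmetric coupling matrix with mean j0/N and variance j^2/N."""
    mean = j0 / n_spins
    std = j / math.sqrt(n_spins)
    couplings = [[0.0] * n_spins for _ in range(n_spins)]
    for i in range(n_spins):
        for k in range(i + 1, n_spins):
            value = rng.gauss(mean, std)
            couplings[i][k] = value
            couplings[k][i] = value  # J_ij = J_ji; the diagonal stays zero
    return couplings


rng = random.Random(0)  # fixed seed so that the run is reproducible
N = 200
J = sample_couplings(N, j0=0.5, j=1.0, rng=rng)

# Empirical mean and variance over the N(N-1)/2 independent couplings.
upper = [J[i][k] for i in range(N) for k in range(i + 1, N)]
sample_mean = sum(upper) / len(upper)
sample_var = sum((x - sample_mean) ** 2 for x in upper) / len(upper)
```

With this scaling the field felt by any one spin, a sum of about $N$ such couplings, stays of order unity as $N$ grows, which is what keeps the energy extensive.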


On invoking the replica method as outlined in sec. 6.5.5, we average over
the disorder by employing the cumulant expansion so as to obtain
$\overline{Z^n}$ which, for the Hamiltonian (6-195), works out to [10]

$$\begin{aligned}
\overline{Z^n} = {} & \exp\left[\frac{(\beta J)^2 nN}{4}\right] \int \prod_{\alpha<\gamma}\left(\beta J\sqrt{\frac{N}{2\pi}}\, dq^{(\alpha\gamma)}\right) \prod_{\alpha}\left(\sqrt{\frac{N\beta J_0}{2\pi}}\, dm^{(\alpha)}\right) \\
& \times \exp\Big[-\frac{N(\beta J)^2}{2}\sum_{\alpha<\gamma}\big(q^{(\alpha\gamma)}\big)^2 - \frac{N\beta J_0}{2}\sum_{\alpha}\big(m^{(\alpha)}\big)^2 + N\ln\mathrm{Tr}\,\exp L(\{q^{(\alpha\gamma)}\},\{m^{(\alpha)}\},\{s^{(\alpha)}\})\Big].
\end{aligned} \tag{6-197a}$$

In this expression $q^{(\alpha\gamma)}$ and $m^{(\alpha)}$ are auxiliary
variables introduced into the theory as a trade-off against the random
variables $J_{ij}$ as the latter are averaged over, where the
super-indices $\alpha, \gamma$ attached to these new variables range from
1 to $n$, $n$ being the number of (imagined) replicas of the spin system.
One observes that $\overline{Z^n}$ appears as the partition function of an
equivalent fictitious system with a Hamiltonian depending on the set of
auxiliary variables $\{q^{(\alpha\gamma)}\}$, $\{m^{(\alpha)}\}$, as also
on the set of single site replica spins $\{s^{(\alpha)}\}$
($\alpha = 1, 2, \cdots, n$), where each $s^{(\alpha)}$ takes up values
$\pm 1$. The fictitious Hamiltonian (to be distinguished from the
effective $n$-replica Hamiltonian (6-192) of sec. 6.5.6) includes
interaction-free quadratic terms involving the auxiliary variables (the
equilibrium values of which will be seen to represent physical quantities
of basic relevance) as also interaction terms involving the replica spins
and the auxiliary variables entering through the function $L$. The latter
is given by the expression

$$L(\{q^{(\alpha\gamma)}\},\{m^{(\alpha)}\},\{s^{(\alpha)}\}) = (\beta J)^2\sum_{\alpha<\gamma} q^{(\alpha\gamma)} s^{(\alpha)} s^{(\gamma)} + \beta\sum_{\alpha}\big(J_0 m^{(\alpha)} + h\big) s^{(\alpha)}, \tag{6-197b}$$

while the trace (Tr) in (6-197a) denotes a summation over all possible
configurations of the single-site spins.




In (6-197a), a number of terms have been omitted, their contribution to the 
final result for the free energy per site being negligible in the thermodynamic 


limit. 


The free energy per site is then obtained as (refer to (6-191))

$$f = -\beta^{-1}\lim_{N\to\infty}\frac{1}{N}\lim_{n\to 0}\frac{1}{n}\left(\overline{Z^n} - 1\right), \tag{6-198}$$

where, strictly speaking, one has to take the limit $n \to 0$ before the
thermodynamic limit $N \to \infty$. As mentioned earlier, a change in the
order of the two limits is of much greater convenience, without having any
adverse effect on the final result.
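The limit $n \to 0$ in (6-198) rests on the elementary identity $\ln Z = \lim_{n\to 0}(Z^n - 1)/n$, applied under the disorder average. The sketch below illustrates it with a handful of made-up positive numbers standing in for the partition function in different disorder samples; these values and the helper name `replica_estimate` are purely hypothetical.

```python
import math


def replica_estimate(z_samples, n):
    """Return (average of Z^n - 1) / n, which tends to the average of ln Z as n -> 0."""
    mean_zn = sum(z ** n for z in z_samples) / len(z_samples)
    return (mean_zn - 1.0) / n


# Hypothetical partition-function values, one per disorder sample.
z_samples = [0.5, 2.0, 3.5, 10.0]

# Quenched average: the logarithm is averaged directly.
exact = sum(math.log(z) for z in z_samples) / len(z_samples)

# Replica estimates for decreasing (real) replica number n.
estimates = {n: replica_estimate(z_samples, n) for n in (1.0, 0.1, 0.001)}
errors = {n: abs(value - exact) for n, value in estimates.items()}
```

The error shrinks roughly in proportion to $n$, which is the content of the replica limit. In the actual calculation $\overline{Z^n}$ is worked out for integer $n$ and then continued to $n \to 0$.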


Reversing the order of the two limits, one can evaluate the thermodynamic
limit by invoking the method of steepest descent, since the argument of
the exponential in (6-197a) is proportional to $N$. The free energy per
site is thereby obtained as

$$f = -\beta^{-1}\lim_{n\to 0}\left\{\frac{(\beta J)^2}{4}\left(1 - \frac{2}{n}\sum_{\alpha<\gamma}\big(q^{(\alpha\gamma)}\big)^2\right) - \frac{\beta J_0}{2n}\sum_{\alpha}\big(m^{(\alpha)}\big)^2 + \frac{1}{n}\ln\mathrm{Tr}\,\exp L\right\}, \tag{6-199}$$

where a necessary condition for an equilibrium solution is that $f$ has
to be stationary with respect to variations in $q^{(\alpha\gamma)}$ and
$m^{(\alpha)}$. Note that, in (6-197a), (6-197b), the variables
$q^{(\alpha\gamma)}$ occur with indices $\alpha < \gamma$. We can define an
extended set of variables $q^{(\alpha\gamma)}$ forming an $n \times n$
symmetric matrix ($q^{(\alpha\gamma)} = q^{(\gamma\alpha)}$) whose diagonal
elements are zero, of which the off-diagonal elements are involved in the
expression for $f$ in (6-199).

The stationarity of $f$ requires that the parameters $q^{(\alpha\gamma)}$,
$m^{(\alpha)}$ in (6-199) satisfy

$$\frac{\partial f}{\partial q^{(\alpha\gamma)}} = 0, \qquad \frac{\partial f}{\partial m^{(\alpha)}} = 0 \qquad (\alpha, \gamma = 1, \cdots, n), \tag{6-200}$$

where, however, the limit $n \to 0$ is to be taken at the end. Along with
conditions (6-200), one also needs the thermodynamic stability condition
that $f$ is to be a minimum in the space of the relevant variables, which
will be seen to ultimately lead us to the appropriate order parameter
describing the spin glass transition.


The vanishing of the first derivatives is seen to imply

$$q^{(\alpha\gamma)} = \lim_{n\to 0}\frac{\mathrm{Tr}\, s^{(\alpha)} s^{(\gamma)} e^{L}}{\mathrm{Tr}\, e^{L}}, \qquad m^{(\alpha)} = \lim_{n\to 0}\frac{\mathrm{Tr}\, s^{(\alpha)} e^{L}}{\mathrm{Tr}\, e^{L}}, \tag{6-201}$$

(check this out) with $L$ given by (6-197b). The formulae (6-199) and
(6-201), along with (6-197b), constitute a set of implicit equations
determining the free energy per site and the equilibrium values of the
parameters $q^{(\alpha\gamma)}$, $m^{(\alpha)}$. One observes that
$q^{(\alpha\gamma)}$ and $m^{(\alpha)}$ appear as expectation values of
$s^{(\alpha)} s^{(\gamma)}$ and $s^{(\alpha)}$ respectively with reference
to the effective Hamiltonian $-\beta^{-1}L$ (refer to formula (6-197b))
subject to the limit $n \to 0$, where the $s^{(\alpha)}$ stand for the
single-site replica spin variables:

$$q^{(\alpha\gamma)} = \langle s^{(\alpha)} s^{(\gamma)}\rangle_L, \qquad m^{(\alpha)} = \langle s^{(\alpha)}\rangle_L. \tag{6-202}$$

Here and in the following we use the notation

$$\langle(\cdot)\rangle_L \equiv \lim_{n\to 0}\frac{\mathrm{Tr}\,(\cdot)\, e^{L}}{\mathrm{Tr}\, e^{L}}. \tag{6-203}$$

Thus, one has to distinguish between the fictitious Hamiltonian mentioned
in reference to (6-197a), the effective Hamiltonian (6-192) introduced in
sec. 6.5.6, and the function $-\beta^{-1}L$ of (6-197b) that features in
formulae (6-201), (6-202), where the three are seen to play the roles of
‘Hamiltonians’ in different contexts in the theory.


6.5.7.1 The replica symmetric solution 


Sherrington and Kirkpatrick, and also Edwards and Anderson, opted to
look at the replica symmetric solution(s) to the implicit equations
(6-199), (6-201), where $q^{(\alpha\gamma)}$ and $m^{(\alpha)}$ are
independent of the super-indices $\alpha, \gamma$. In the following, these
will be denoted by $q$, $m$.

The effective Hamiltonian given in (6-192) (for any specified value of the
system size $N$) is symmetric under interchange of any pair of replica
indices, a feature that persists even in the presence of a non-zero field
$h$. Solutions for the thermodynamic variables that do not break this
symmetry are referred to as replica symmetric ones. However, as we will
see below, the replica symmetric solution (also referred to as the SK
solution) does not correctly describe a spin glass below the freezing
temperature $T_f$, and one has to look for replica symmetry breaking that
leads to an order parameter of a new kind.

In the case of this replica symmetric solution the equilibrium free energy
(for system size $N$ and temperature $T = (k_B\beta)^{-1}$) and the
corresponding values of the parameters $q$, $m$ are given by the set of
implicit equations satisfying (6-199), (6-200) self-consistently:

$$-\beta f = \frac{(\beta J)^2}{4}(1 - q)^2 - \frac{\beta J_0}{2}m^2 + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, e^{-\frac{1}{2}z^2}\ln[2\cosh\beta h(z)], \tag{6-204a}$$

$$q = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, e^{-\frac{1}{2}z^2}\tanh^2\beta h(z), \tag{6-204b}$$

$$m = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, e^{-\frac{1}{2}z^2}\tanh\beta h(z), \tag{6-204c}$$

where

$$h(z) = J\sqrt{q}\, z + J_0 m + h. \tag{6-204d}$$


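Notes on derivation of (6-204a)-(6-204d). For the replica symmetric
solution, the first term in formula (6-197b) can be expressed in the form

$$(\beta J)^2 q\sum_{\alpha<\gamma} s^{(\alpha)} s^{(\gamma)} = \frac{1}{2}(\beta J)^2 q\Big[\Big(\sum_{\alpha} s^{(\alpha)}\Big)^2 - n\Big]. \tag{6-205}$$

This gives a term $-\frac{1}{2}(\beta J)^2 q$ coming from
$\frac{1}{n}\ln\mathrm{Tr}\, e^{L}$ in (6-199) that adds to the first term
within brackets (which reduces to $\frac{(\beta J)^2}{4}(1 + q^2)$ in the
limit $n \to 0$) so as to give a contribution
$\frac{(\beta J)^2}{4}(1 - q)^2$ to $-\beta f$. The next term in formula
(6-199) simplifies to $-\frac{1}{2}\beta J_0 m^2$ for the replica
symmetric solution. What remains in the expression for $-\beta f$ is
$\frac{1}{n}\ln\mathrm{Tr}\, e^{L'}$, where
$L' = \frac{1}{2}(\beta J)^2 q\big(\sum_{\alpha} s^{(\alpha)}\big)^2 + \beta(J_0 m + h)\sum_{\alpha} s^{(\alpha)}$.
We now make use of the identity
$e^{\frac{\lambda}{2}a^2} = \sqrt{\frac{\lambda}{2\pi}}\int dz\, e^{-\frac{\lambda}{2}z^2} e^{\lambda a z}$,
with $\lambda = 1$, $a = \beta J\sqrt{q}\sum_{\alpha} s^{(\alpha)}$, so as
to obtain
$\mathrm{Tr}\, e^{L'} = \frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}\,\mathrm{Tr}\, e^{b\sum_{\alpha} s^{(\alpha)}}$,
where $b = \beta J\sqrt{q}\, z + \beta(J_0 m + h) = \beta h(z)$. But
$\mathrm{Tr}\, e^{b\sum_{\alpha} s^{(\alpha)}} = (2\cosh b)^n$ since, for
each $\alpha = 1, 2, \cdots, n$, $s^{(\alpha)}$ is to be summed over the
values $\pm 1$.

We now go over to $n \to 0$. Observing that, in this limit,
$(2\cosh b)^n \approx 1 + n\ln(2\cosh b)$, we obtain

$$\frac{1}{n}\ln\mathrm{Tr}\, e^{L'} = \frac{1}{n}\ln\Big[\frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}\big(1 + n\ln(2\cosh b)\big)\Big] = \frac{1}{n}\ln\Big[1 + \frac{n}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}\ln\big(2\cosh(\beta h(z))\big)\Big],$$

which finally leads to (6-204a) (recall that
$\frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2} = 1$ and, for
sufficiently small $n$, $\ln(1 + nA) \approx nA$ where $A$ is any finite
parameter).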
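The two ingredients of this derivation, the Gaussian linearisation identity and the small-$n$ expansion of $(2\cosh b)^n$, are easy to check numerically. The sketch below is only a sanity check under assumed sample values ($a = 1.3$, $b = 0.7$, $n = 10^{-4}$); the helper `gaussian_average` is a plain trapezoidal rule written for this example.

```python
import math


def gaussian_average(func, z_max=10.0, steps=2000):
    """Trapezoidal estimate of (1/sqrt(2 pi)) * integral dz exp(-z^2/2) func(z)."""
    dz = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * dz
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the end points
        total += weight * math.exp(-0.5 * z * z) * func(z)
    return total * dz / math.sqrt(2.0 * math.pi)


# Identity exp(a^2 / 2) = (1/sqrt(2 pi)) * integral dz exp(-z^2/2 + a z).
a = 1.3
lhs = math.exp(0.5 * a * a)
rhs = gaussian_average(lambda z: math.exp(a * z))

# Small-n expansion (2 cosh b)^n ~ 1 + n ln(2 cosh b).
b, n = 0.7, 1.0e-4
exact_power = (2.0 * math.cosh(b)) ** n
linearised = 1.0 + n * math.log(2.0 * math.cosh(b))
```

The first identity is what trades the square of the replica sum for a linear term in a Gaussian field $z$. The second is what makes the limit $n \to 0$ tractable once the trace has factorised over replicas.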


Formula (6-204b) is obtained from the first relation in (6-202) by noting
that, in accordance with (6-203),
$\langle s^{(\alpha)} s^{(\gamma)}\rangle_L = \lim_{n\to 0}\frac{\mathrm{Tr}\, s^{(\alpha)} s^{(\gamma)} e^{L}}{\mathrm{Tr}\, e^{L}}$.
For the replica-symmetric solution, this expression does not depend on the
replica indices $\alpha, \gamma$ ($\alpha \neq \gamma$), and we can take,
for the sake of concreteness, $\alpha = 1$, $\gamma = 2$ (say). The
denominator tends to unity in the limit $n \to 0$, this being consistent
with the observation that $\lim_{n\to 0}\frac{1}{n}\ln\mathrm{Tr}\, e^{L}$
is finite in (6-199) (refer to the above derivation). We thus have to
evaluate the numerator alone, where we once again make use of the relation

$$\exp\Big[(\beta J)^2 q\sum_{\mu<\nu} s^{(\mu)} s^{(\nu)}\Big] = \exp\Big[\frac{1}{2}(\beta J)^2 q\Big(\Big(\sum_{\mu} s^{(\mu)}\Big)^2 - n\Big)\Big],$$

(where $\mu, \nu$ are replica indices, each summed over from 1 to $n$),
and note that $e^{-\frac{n}{2}(\beta J)^2 q} \to 1$ in the limit
considered. This leads to the relation

$$q = \lim_{n\to 0}\frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}\,\mathrm{Tr}\Big(s^{(1)} s^{(2)} e^{b\sum_{\mu} s^{(\mu)}}\Big)$$

in the notation used above. But
$\sum_{\mu=1}^{n} s^{(\mu)} = s^{(1)} + s^{(2)} + \sum_{\mu=3}^{n} s^{(\mu)}$.
Further, since the trace involves a sum over all possible configurations
of the replica spins, we have

$$q = \lim_{n\to 0}\frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}\Big[\mathrm{Tr}\big(s^{(1)} e^{b s^{(1)}}\big)\,\mathrm{Tr}\big(s^{(2)} e^{b s^{(2)}}\big)\big(\mathrm{Tr}\, e^{b s^{(a)}}\big)^{n-2}\Big],$$

where $a$ is any replica index chosen from $3, \cdots, n$, and the trace
is now over configurations ($\pm 1$) of each relevant replica spin. In
other words,

$$q = \lim_{n\to 0}\frac{1}{\sqrt{2\pi}}\int dz\, e^{-\frac{1}{2}z^2}(2\sinh b)^2(2\cosh b)^{n-2},$$

which finally leads to (6-204b).

Formula (6-204c) follows in a similar manner, and can also be obtained
directly from (6-204a) and the second relation in (6-200).
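The implicit equations for $q$ and $m$ can be solved by simple fixed-point iteration. The sketch below is one possible way of doing so, not a method prescribed in the text: it assumes units with $k_B = 1$, uses the effective field $h(z) = J\sqrt{q}\,z + J_0 m + h$ of (6-204d), and replaces the Gaussian integral by a trapezoidal sum. The names `gaussian_average` and `solve_sk`, the starting guess and the tolerances are all illustrative.

```python
import math


def gaussian_average(func, z_max=8.0, steps=400):
    """Trapezoidal estimate of (1/sqrt(2 pi)) * integral dz exp(-z^2/2) func(z)."""
    dz = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * dz
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * z * z) * func(z)
    return total * dz / math.sqrt(2.0 * math.pi)


def solve_sk(temperature, j=1.0, j0=0.0, h=0.0, tol=1e-12, max_iter=5000):
    """Iterate q = <tanh^2(beta h(z))>, m = <tanh(beta h(z))> to a fixed point."""
    beta = 1.0 / temperature  # k_B = 1 is assumed
    q, m = 0.5, 0.5           # arbitrary non-zero starting guess
    for _ in range(max_iter):
        root_q = math.sqrt(q)

        def tanh_field(z):
            # tanh(beta * h(z)) with h(z) = J sqrt(q) z + J0 m + h
            return math.tanh(beta * (j * root_q * z + j0 * m + h))

        q_new = gaussian_average(lambda z: tanh_field(z) ** 2)
        m_new = gaussian_average(tanh_field)
        converged = abs(q_new - q) < tol and abs(m_new - m) < tol
        q, m = q_new, m_new
        if converged:
            break
    return q, m


# J0 = 0, h = 0: the freezing temperature is T_f = J (here J = 1).
q_above, m_above = solve_sk(temperature=2.0)  # paramagnetic side
q_below, m_below = solve_sk(temperature=0.9)  # just below T_f
```

Above $T_f$ the iteration collapses to $q = 0$, while just below it settles on a small positive $q$ close to $(T_f - T)/T_f$, with $m$ remaining zero. The iteration slows down as $T \to T_f$, so a tighter iteration budget may be needed very close to the transition.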


Formulae (6-204b), (6-204c) are referred to as the SK equations in spin
glass theory, where $q$ has the relevance of the spin-glass order
parameter, being identical with the Edwards-Anderson (EA) order parameter
$q_{\rm EA}$ introduced in sec. 6.5.6. Indeed, the equilibrium values of
the parameters $q$ and $m$ given by (6-204b), (6-204c) can be interpreted
as the spin-glass and ferromagnetic order parameters that describe the
transitions from the paramagnetic phase ($q = 0, m = 0$ for $h = 0$) to
the spin-glass ($q \neq 0, m = 0$) and ferromagnetic
($q \neq 0, m \neq 0$) phases respectively.

More specifically, $m$ and $q$ can be identified as (refer to [10])

$$m = \overline{\langle s_i\rangle}, \qquad q = \overline{\langle s_i\rangle^2}, \tag{6-206}$$

where, recall that the overline denotes an averaging over the disorder,
and the angular brackets denote the thermal average over all possible spin
configurations of the system under consideration, referred to the
Hamiltonian (6-195). What is more, the averaging over the disorder makes
the system translationally invariant, so that the spin index $i$ has no
relevance in the above formulae. A non-zero value of $m$ indicates a
non-zero magnetization per site. In the spin-glass phase, there is a
randomly distributed magnetization, so that the disorder-averaged
magnetization per site is zero but the disorder average of its square is
non-zero. In the ferromagnetic phase, both the parameters are non-zero.
The transition from the paramagnetic phase to the spin glass phase has to
compete with one from the paramagnetic to the ferromagnetic phase. The
outcome of the competition is determined by the relative magnitudes of the
parameters $J_0$, $J$, i.e., by the mean and variance of the distribution
of coupling strengths $J_{ij}$. Thus, looking at the formulae (6-204b),
(6-204c), we find that, with $h = 0$ (a non-zero external field induces a
magnetization by spin alignment) and at a sufficiently low temperature,
$J_0 = 0, J \neq 0$ implies $m = 0, q \neq 0$ (spin glass phase) while, on
the other hand, $J = 0, J_0 \neq 0$ implies $m, q \neq 0$ (ferromagnetic
phase).


The SK equations (6-204b), (6-204c) for the order parameters $q$, $m$
enable one to obtain the transition temperature $T_f$ in terms of the
parameters defining the system. On expanding the right hand sides of the
equations in a power series in $b = \beta h(z)$,
$h(z) = J\sqrt{q}\, z + J_0 m + h$, and evaluating the resulting Gaussian
integrals, one gets, for sufficiently small $m$, $q$, $h$,

$$q \approx \beta^2\big[(J_0 m + h)^2 + J^2\big(1 - 4\beta^2(J_0 m + h)^2\big)q - 2\beta^2 J^4 q^2\big], \tag{6-207a}$$

$$m \approx \beta\big[(J_0 m + h)\big(1 - \tfrac{1}{3}\beta^2(J_0 m + h)^2\big) - \beta^2 J^2(J_0 m + h)q\big]. \tag{6-207b}$$

Noting that the order parameter $q$ has to satisfy the requirement
$q \geq 0$ in accordance with its physical interpretation (second relation
in (6-206)), one observes that, for $J_0 = 0$, $h = 0$ (i.e., with no
ferromagnetic bias in the spin-spin interaction and in the absence of an
external field) the only solution to the above equations for
$\beta < J^{-1}$ is $q = 0$, $m = 0$, while a positive solution for $q$
appears for $\beta > J^{-1}$. In other words, the freezing temperature for
the spin glass transition is given by

$$T_f = (k_B\beta_f)^{-1}, \qquad \beta_f = \frac{1}{J}, \tag{6-208a}$$

with the order parameters from (6-207a), (6-207b) appearing as

$$\begin{aligned}
(T > T_f)\quad & q = 0, \quad m = 0, \\
(T < T_f)\quad & q = \frac{(\beta J)^2 - 1}{2(\beta J)^4}, \quad m = 0.
\end{aligned} \tag{6-208b}$$
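The small-argument expansions behind (6-207a), (6-207b) can be checked from the Gaussian moments of $x = c + az$. The shorthand $c = \beta(J_0 m + h)$ and $a = \beta J\sqrt{q}$ is introduced here only for this check and is not notation from the text; $\langle\cdot\rangle_z$ denotes the average over the unit Gaussian variable $z$. A sketch of the arithmetic, assuming $\tanh x \approx x - x^3/3$ and $\tanh^2 x \approx x^2 - \tfrac{2}{3}x^4$:

```latex
\begin{align*}
\langle x^2\rangle_z &= c^2 + a^2, \qquad
\langle x^3\rangle_z = c^3 + 3ca^2, \qquad
\langle x^4\rangle_z = c^4 + 6c^2a^2 + 3a^4,\\
q &\approx \langle x^2\rangle_z - \tfrac{2}{3}\langle x^4\rangle_z
   = c^2 + a^2\left(1 - 4c^2\right) - 2a^4 - \tfrac{2}{3}c^4,\\
m &\approx \langle x\rangle_z - \tfrac{1}{3}\langle x^3\rangle_z
   = c\left(1 - \tfrac{1}{3}c^2\right) - c\,a^2 .
\end{align*}
```

Substituting $a^2 = \beta^2 J^2 q$ and $c = \beta(J_0 m + h)$, and dropping the $c^4$ term, reproduces the two expansions. With $c = 0$ the first line reduces to $q = (\beta J)^2 q - 2(\beta J)^4 q^2$, whose non-zero root exists only for $\beta J > 1$.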


Making use of the formulae (6-207a), (6-207b), where higher order terms
can be included as necessary, one can work out various thermodynamic
parameters of interest for $T > T_f$ and also for $T < T_f$. Thus, for
$T > T_f$ (with $J_0 = 0$, $h = 0$), the free energy per site is given by

$$(T > T_f)\quad f \approx -k_B T\ln 2 - \frac{J^2}{4k_B T}, \tag{6-209a}$$

while the mean energy per site is obtained as

$$u \approx -\frac{J^2}{2k_B T}\left(1 - q^2\right), \tag{6-209b}$$

where $q = 0$ for $T > T_f$ while, for $T$ slightly below $T_f$,

$$q \approx \frac{|T - T_f|}{T_f} + \frac{1}{3}\left(\frac{T - T_f}{T_f}\right)^2 \tag{6-209c}$$

(check these results out, taking into account terms of next to leading
order of approximation). As remarked earlier, results for $T < T_f$ are
only formal ones and are not in conformity with experimental observations.


On including terms with $J_0 \neq 0$, $h \neq 0$, one can evaluate the
mean magnetization per site ($m$), and then the susceptibility
$\chi = \left.\frac{\partial m}{\partial h}\right|_{h\to 0}$ (in units of
$k_B^{-1}$), which works out to

$$\chi = \frac{\beta\left(1 - \beta^2 J^2 q\right)}{1 - \beta J_0\left(1 - \beta^2 J^2 q\right)} \tag{6-210}$$

(check this out as well; in the present context $\chi$ does not have the
dimension of magnetic susceptibility).


A quantity of particular relevance in respect of the spin-glass phase is
the nonlinear susceptibility. Expanding the magnetization in powers of $h$
(note that, in the present context, $m$ and $h$ do not have the dimensions
of magnetic moment and magnetic field strength), the coefficient of the
linear term is identified as the susceptibility ($\chi$) while that of the
cubic term (with a negative sign) defines the nonlinear susceptibility
($\chi_{\rm nl}$):

$$m = \chi h - \chi_{\rm nl} h^3 + \cdots. \tag{6-211}$$

A tell-tale signature of a spin glass system is that the nonlinear
susceptibility diverges as $T$ approaches the transition temperature
$T_f$.

A parameter closely related to the nonlinear susceptibility is the
spin-glass susceptibility $\chi_{\rm SG}$, given by the relation

$$\chi_{\rm nl} = \beta\left(\chi_{\rm SG} - \frac{2}{3}\beta^2\right). \tag{6-212}$$

In microscopic terms, $\chi_{\rm SG}$ is defined as

$$\chi_{\rm SG} = \frac{\beta^2}{N}\sum_{i,j}\overline{\big(\langle s_i s_j\rangle - \langle s_i\rangle\langle s_j\rangle\big)^2}, \tag{6-213}$$

(recall that the overline signifies an average over the disorder), and is
necessarily a positive quantity.
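For a very small system the spin-glass susceptibility can be evaluated by brute-force enumeration, which makes its positivity explicit. The sketch below assumes the definition $\chi_{\rm SG} = (\beta^2/N)\sum_{i,j}\overline{(\langle s_i s_j\rangle - \langle s_i\rangle\langle s_j\rangle)^2}$ with $k_B = 1$, couplings of zero mean and variance $J^2/N$, and a hypothetical six-spin system averaged over twenty disorder samples; all function names are illustrative.

```python
import itertools
import math
import random


def chi_sg_one_sample(couplings, beta, h=0.0):
    """(beta^2/N) * sum_ij (<s_i s_j> - <s_i><s_j>)^2 by exact enumeration."""
    n = len(couplings)
    z_sum = 0.0
    s_sum = [0.0] * n
    ss_sum = [[0.0] * n for _ in range(n)]
    for spins in itertools.product((-1, 1), repeat=n):
        energy = -h * sum(spins)
        for i in range(n):
            for j in range(i + 1, n):
                energy -= couplings[i][j] * spins[i] * spins[j]
        weight = math.exp(-beta * energy)  # Boltzmann weight
        z_sum += weight
        for i in range(n):
            s_sum[i] += weight * spins[i]
            for j in range(n):
                ss_sum[i][j] += weight * spins[i] * spins[j]
    total = 0.0
    for i in range(n):
        for j in range(n):
            connected = ss_sum[i][j] / z_sum - (s_sum[i] / z_sum) * (s_sum[j] / z_sum)
            total += connected ** 2
    return beta ** 2 * total / n


def sample_couplings(n, j, rng):
    """Symmetric Gaussian couplings with zero mean and variance j^2/n."""
    couplings = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            couplings[i][k] = couplings[k][i] = rng.gauss(0.0, j / math.sqrt(n))
    return couplings


rng = random.Random(1)
samples = [sample_couplings(6, 1.0, rng) for _ in range(20)]


def chi_sg(beta):
    """Disorder average of the single-sample susceptibility."""
    return sum(chi_sg_one_sample(c, beta) for c in samples) / len(samples)


chi_high_t = chi_sg(1.0e-3)  # nearly free spins: chi_SG / beta^2 -> 1
chi_low_t = chi_sg(1.0)
```

At high temperature only the diagonal terms survive and $\chi_{\rm SG}/\beta^2 \to 1$. Lowering the temperature adds the squared off-diagonal correlations, which can only increase the sum. A six-spin system shows no transition, so this illustrates the definition rather than the divergence at $T_f$.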


With the spin glass parameter $J = 0$, formula (6-210) gives the
Curie-Weiss law for ferromagnetism corresponding to the transition
temperature

$$T_c = k_B^{-1} J_0, \tag{6-214}$$

where the susceptibility diverges as $T \to T_c$. With $J \neq 0$, on the
other hand, one finds that (6-210) gives rise to a cusp in the
susceptibility curve at $T = T_f$ (we assume $J_0 \ll J$), analogous to
the one in fig. 6-22, in apparent conformity with experimental
observations. The expression for susceptibility worked out from the SK
solution is, however, not valid for $T < T_f$. A cusp in the
susceptibility curve also appears in the Parisi theory to be briefly
introduced below in sec. 6.5.9, again in conformity with experimental
observations. The variation of specific heat per site with temperature,
worked out from (6-209b), (6-209c), is also seen to imply a cusp which,
however, is contrary to experimental results. This, once again, is
indicative of the failure of the replica-symmetric solution to the SK
equations below the freezing temperature $T_f$.
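The two limiting behaviours of the susceptibility can be illustrated with a few lines of code. The sketch assumes the form $\chi = \beta(1 - \beta^2J^2q)/[1 - \beta J_0(1 - \beta^2J^2q)]$ for (6-210), together with the near-transition estimate $q \approx \tau + \tau^2/3$, $\tau = (T_f - T)/T_f$, below $T_f = J$ and $q = 0$ above it, in units with $k_B = 1$. Both expressions are small-$q$ approximations, so the numbers are meaningful only close to $T_f$; the function names are illustrative.

```python
def q_near_transition(temperature, j):
    """Leading estimate of q: zero above T_f = J, tau + tau^2/3 just below."""
    t_f = j  # k_B = 1 is assumed
    if temperature >= t_f:
        return 0.0
    tau = (t_f - temperature) / t_f
    return tau + tau * tau / 3.0


def susceptibility(temperature, j, j0):
    """chi = beta (1 - beta^2 J^2 q) / (1 - beta J0 (1 - beta^2 J^2 q))."""
    beta = 1.0 / temperature
    q = q_near_transition(temperature, j)
    factor = 1.0 - (beta * j) ** 2 * q
    return beta * factor / (1.0 - beta * j0 * factor)


# J = 0: Curie-Weiss law, chi = 1 / (T - J0), diverging as T -> T_c = J0.
chi_curie_weiss = susceptibility(2.0, j=0.0, j0=1.0)

# J = 1, J0 = 0: the susceptibility peaks at T_f = 1 with a change of slope.
chi_below = susceptibility(0.95, j=1.0, j0=0.0)
chi_at_tf = susceptibility(1.0, j=1.0, j0=0.0)
chi_above = susceptibility(1.05, j=1.0, j0=0.0)
```

With these inputs the susceptibility is continuous at $T_f$ but lower on either side, which is the cusp referred to above.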


The non-linear susceptibility ($\chi_{\rm nl}$, see formula (6-211))
worked out from the above replica-symmetric solution diverges for
$T \to T_f$, as in numerous other spin glass models, and the divergence
goes like $(T - T_f)^{-1}$, a feature of mean field theories. Indeed, as
mentioned below in sec. 6.5.7.2, the SK equations are obtained in a naive
mean field approach in respect of the Hamiltonian (6-195), where the
approach itself is deficient in an essential way. A formulation by
Thouless, Anderson, and Palmer corrects for the deficiency, as we indicate
in sec. 6.5.8.


A result of great interest emerging in the SK theory is that the
transition temperature $T_f$ depends on the external field $h$. Indeed,
the spin-glass transition continues to be a second order one even in the
presence of a non-zero field, in marked contrast to a ferromagnetic
transition. The field dependence of the transition can be inferred from
the SK equations and is depicted graphically in fig. 6-24, in which the
curve showing the variation of $T_f$ with the applied field $h$ is
referred to as the Almeida-Thouless line (the ‘AT line’ in brief). Above
and to the right of the line the replica-symmetric solution correctly
describes the thermodynamics of the spin glass, with the free energy per
site given by (6-209a), which corresponds to the spin-glass paramagnetic
phase, with the susceptibility given by (6-210).


As mentioned a number of times in the above paragraphs, the
replica-symmetric solution loses validity below the transition. This
corresponds to the region below and to the left of the AT line in
fig. 6-24 ($T < T_f(0) = J/k_B$ for $h = 0$). Close to $h = 0$, the AT
line is given by the formula (refer to [37], chapter 3)

$$\left(\frac{h}{J}\right)^2 \approx \frac{4}{3}\left(1 - \frac{T_f(h)}{T_f(0)}\right)^3. \tag{6-215}$$

[Figure 6-24: plot of $h/J$ (vertical axis) against $k_BT/J$ (horizontal
axis), with the regions on the two sides of the AT line labelled
‘paramagnetic’ and ‘spin-glass’.]

Figure 6-24: Depicting the Almeida-Thouless (AT) line distinguishing
between the paramagnetic phase and the spin-glass phase in the SK model
(schematic) with $J_0 = 0$; the transition from the paramagnetic to the
spin-glass phase occurs at the field-dependent freezing temperature
$T_f(h)$, given by (6-215) for a weak field ($h \approx 0$); across the AT
line, the paramagnetic solution becomes unstable and a new stable state,
represented by the spin-glass phase, emerges, where the latter has a
complex structure; the AT stability line can also be worked out for
$J_0 \neq 0$, where the phase diagram (now worked out with $h = 0$ for the
sake of simplicity) looks as in fig. 6-25.


Below the AT line, the method of steepest descent adopted in evaluating
the free energy $f$ becomes inconsistent in that the terms discarded as
being small (compared to the leading approximant) can no longer be
considered negligible. Indeed, the spin glass transition corresponds to a
loss of thermodynamic stability, and the stable solution emerging below
the AT line is not a replica-symmetric one. This becomes evident on
looking at the spin-glass susceptibility $\chi_{\rm SG}$ (formula (6-213))
which, on being worked out for the replica-symmetric solution with
non-zero $h$, becomes negative below the AT line. Another signature of
inconsistency of the SK solution is that the entropy is found to become
negative at a sufficiently low temperature below the AT line.


All this can be related to the question as to when the saddle-point
condition made use of for evaluating $\overline{Z^n}$ by the method of
steepest descent yields a minimum value for the free energy functional
against variations of the parameters $q^{(\alpha\gamma)}$, $m^{(\alpha)}$.
We consider, for the sake of simplicity, the special case $J_0 = 0$ when
one need not look at variations of the $m^{(\alpha)}$'s (the
generalization to a non-zero value of $J_0$ involves no new principles).
Referring to the integrand on the right hand side of (6-197a), one
observes that the effect of fluctuations in the parameters
$q^{(\alpha\gamma)}$ depends on the eigenvalues of the Hessian matrix

$$A^{(\alpha\gamma,\mu\delta)} = \frac{\partial^2 G}{\partial q^{(\alpha\gamma)}\,\partial q^{(\mu\delta)}}, \tag{6-216a}$$

evaluated at the saddle point, where

$$G(\{q^{(\alpha\gamma)}\},\{s^{(\alpha)}\}) = \frac{1}{2}(\beta J)^2\sum_{\alpha<\gamma}\big(q^{(\alpha\gamma)}\big)^2 - \ln\mathrm{Tr}\,\exp\Big[(\beta J)^2\sum_{\alpha<\gamma} q^{(\alpha\gamma)} s^{(\alpha)} s^{(\gamma)}\Big]. \tag{6-216b}$$

In this expression the arguments of $G$ on the left hand side include all
the $q^{(\alpha\gamma)}$'s and the single-site replica spins, the trace on
the right hand side representing, once again, a summation over the
possible spin configurations of the replicas.

The Hessian matrix then appears as ([37], [10])

$$A^{(\alpha\gamma,\mu\delta)} = (\beta J)^2\Big[\delta_{(\alpha\gamma),(\mu\delta)} - (\beta J)^2\Big(\langle s^{(\alpha)} s^{(\gamma)} s^{(\mu)} s^{(\delta)}\rangle - \langle s^{(\alpha)} s^{(\gamma)}\rangle\langle s^{(\mu)} s^{(\delta)}\rangle\Big)\Big], \tag{6-216c}$$

where the averages $\langle(\cdot)\rangle$ are defined as in (6-203) (we
omit the suffix ‘L’ here). For any positive integer value of the replica
number $n$, one can, in principle, work out the $\frac{1}{2}n(n-1)$
eigenvalues of the Hessian and then look at these in the limit $n \to 0$.
The evaluation of the free energy from $\overline{Z^n}$ by the method of
steepest descent based on the saddle-point (6-200) then makes sense only
if none of the eigenvalues is negative, since otherwise the terms left out
in the steepest descent approximation grow in magnitude.


For $T > T_f$, the Hessian matrix is diagonal at the replica-symmetric
saddle-point $q^{(\alpha\gamma)} = q = 0$ (which constitutes the only
solution to (6-200)), and the eigenvalues are obtained trivially. In this
case one obtains a single degenerate eigenvalue

$$\lambda = (\beta J)^2\big(1 - (\beta J)^2\big) = \Big(\frac{T_f}{T}\Big)^2\Big[1 - \Big(\frac{T_f}{T}\Big)^2\Big] > 0. \tag{6-217}$$

The spin-glass susceptibility $\chi_{\rm SG}$ (refer to (6-213)) can be
seen to relate directly to $\lambda$ and is found to be positive (as it
should be, by definition), diverging as $T \to T_f$.
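As a quick check of why the Hessian is diagonal above the transition, note that for $J_0 = 0$, $h = 0$ and $q = 0$ the exponent in the trace vanishes, so that the replica spins are averaged independently with equal weights. The sketch below assumes the Hessian has the form $(\beta J)^2[\delta - (\beta J)^2(\langle ssss\rangle - \langle ss\rangle\langle ss\rangle)]$, with index pairs ordered as $\alpha < \gamma$, $\mu < \delta$:

```latex
\begin{align*}
\langle s^{(\alpha)} s^{(\gamma)} \rangle &= 0 \qquad (\alpha \neq \gamma), \\
\langle s^{(\alpha)} s^{(\gamma)} s^{(\mu)} s^{(\delta)} \rangle
  &= \delta_{\alpha\mu}\,\delta_{\gamma\delta}
  \qquad (\alpha < \gamma,\ \mu < \delta), \\
A^{(\alpha\gamma,\mu\delta)}
  &= (\beta J)^2\left[1 - (\beta J)^2\right]
     \delta_{\alpha\mu}\,\delta_{\gamma\delta}.
\end{align*}
```

Every one of the $\frac{1}{2}n(n-1)$ eigenvalues therefore equals $(\beta J)^2[1 - (\beta J)^2]$, which is positive for $\beta J < 1$ and vanishes as $T \to T_f$.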


For $T < T_f$, however, the eigenvalues based on the replica symmetric
solution are no longer all positive. One obtains three eigenvalues in this
case, of which two are degenerate and remain non-negative while the third
eigenvalue turns out to be negative (recall that the eigenvalues are to be
evaluated in the limit $n \to 0$), with a corresponding negative value for
$\chi_{\rm SG}$, indicating that the replica-symmetric solution becomes
unstable as $T$ is made to assume a value less than $T_f$.


The results can all be generalized to include a non-zero field strength
$h$, when the instability is found to set in (with one eigenvalue of the
Hessian matrix becoming negative) as $T$ becomes less than $T_f(h)$ given
by (6-215) (refer to fig. 6-24), where $T_f(0) = J/k_B = T_f$, the
spin-glass transition temperature in the absence of the field. In stating
this result, we have taken $J_0 = 0$ for the sake of simplicity (recall
that $J_0 = N\overline{J_{ij}}$ stands for the ferromagnetic bias in the
distribution of the coupling strengths $J_{ij}$), while the stability
border for $J_0 \neq 0$ can also be worked out.


The condition for the eigenvalues of the Hessian matrix based on the SK
solution to be positive turns out to be

$$1 - (\beta J)^2(1 - 2q + r) > 0, \tag{6-218a}$$

where

$$r = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, e^{-\frac{1}{2}z^2}\tanh^4\beta h(z). \tag{6-218b}$$
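The stability condition can be turned into a numerical estimate of the AT line. The sketch below assumes $J_0 = 0$, units with $k_B = 1$, and the condition in the form $1 - (\beta J)^2(1 - 2q + r) > 0$, with $q$ obtained by fixed-point iteration of the replica-symmetric equation and $r$ the Gaussian average of $\tanh^4\beta h(z)$. The bisection bracket and all function names are illustrative choices. The result may be compared with the leading-order weak-field form commonly quoted in the literature, $(h/J)^2 \approx \frac{4}{3}(1 - T/T_f)^3$.

```python
import math


def gaussian_average(func, z_max=8.0, steps=200):
    """Trapezoidal estimate of (1/sqrt(2 pi)) * integral dz exp(-z^2/2) func(z)."""
    dz = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * dz
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * z * z) * func(z)
    return total * dz / math.sqrt(2.0 * math.pi)


def at_eigenvalue(temperature, h, j=1.0, tol=1e-10, max_iter=5000):
    """1 - (beta J)^2 (1 - 2q + r) on the replica-symmetric solution, J0 = 0."""
    beta = 1.0 / temperature
    q = 0.5  # arbitrary non-zero starting guess
    for _ in range(max_iter):
        root_q = math.sqrt(q)
        q_new = gaussian_average(
            lambda z: math.tanh(beta * (j * root_q * z + h)) ** 2
        )
        converged = abs(q_new - q) < tol
        q = q_new
        if converged:
            break
    root_q = math.sqrt(q)
    r = gaussian_average(lambda z: math.tanh(beta * (j * root_q * z + h)) ** 4)
    return 1.0 - (beta * j) ** 2 * (1.0 - 2.0 * q + r)


def at_temperature(h, j=1.0, t_low=0.3, t_high=1.5, n_bisect=20):
    """Bisect for the temperature at which the stability eigenvalue changes sign."""
    for _ in range(n_bisect):
        t_mid = 0.5 * (t_low + t_high)
        if at_eigenvalue(t_mid, h, j) > 0.0:
            t_high = t_mid  # still stable: the AT line lies at lower T
        else:
            t_low = t_mid
    return 0.5 * (t_low + t_high)


t_at = at_temperature(h=0.1)
# Leading-order weak-field estimate, T_f [1 - (3/4)^(1/3) (h/J)^(2/3)].
t_weak_field = 1.0 - (0.75 * 0.1 ** 2) ** (1.0 / 3.0)
```

The bisection assumes a single sign change of the eigenvalue inside the bracket, which holds for the weak field used here. The two estimates should agree only roughly, since the weak-field form neglects higher powers of $1 - T/T_f$.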


Fig. 6-25 depicts the phase diagram for $J_0 \neq 0$, $h = 0$, where one
observes the boundaries demarcating the paramagnetic phase
($m = 0, q = 0$), the spin-glass phase ($m = 0, q \neq 0$), the
ferromagnetic phase ($m \neq 0, q \neq 0$), and a ‘mixed’ ferromagnetic
phase. The horizontal line $T = T_f$ ($k_BT = J$) (line marked ‘1’ in the
figure) separates the paramagnetic from the spin glass phase for $h = 0$,
while the line $k_BT = J_0$ (line ‘2’) marks the paramagnetic to
ferromagnetic transition ($q = 0$, $\chi \to \infty$, refer to formula
(6-210)).


[Figure 6-25: phase diagram with $k_BT$ on the vertical axis and $J_0$ on
the horizontal axis, both in units of $J$; regions labelled
‘paramagnetic’, ‘ferromagnetic’ and ‘spin-glass’, with numbered boundary
lines as described in the caption.]

Figure 6-25: The phase diagram (schematic) for the SK model (eq. (6-195)),
depicted for $h = 0$; phases appearing in the various regions are the
paramagnetic, the ferromagnetic, the spin-glass, and a ‘mixed’
ferromagnetic phase, of which the last one does not figure in the
replica-symmetric SK solution; the line marked ‘1’ demarcates the
paramagnetic from the spin-glass phase while that marked ‘2’ separates the
paramagnetic from the ferromagnetic phase; the ferromagnetic phase loses
stability as one crosses the line ‘3’, giving rise to the ‘mixed’
ferromagnetic phase; the lines ‘1’ and ‘3’, taken together, constitute the
AT stability border for $J_0 \neq 0$, $h = 0$ (compare with fig. 6-24,
which corresponds to $J_0 = 0$, $h \neq 0$); the line ‘4’ separates the
spin-glass from the mixed ferromagnetic phase, and is obtained in the
Parisi theory.


The AT instability for the ferromagnetic phase (to the right of line ‘2’)
sets in as the temperature is made to go below the line marked ‘3’. In
other words, the lines ‘1’ and ‘3’ taken together constitute the AT line
for $h = 0$ but $J_0 \neq 0$. The replica symmetric SK solution loses
relevance below these lines. The vertical line $J_0 = J$, $\beta J > 1$
(marked ‘4’ in the figure) depicts the transition between the spin-glass
and the mixed ferromagnetic phase. Recall that the spin-glass phase has
$m = 0, q \neq 0$, and one might attempt to obtain the condition for
transition to the mixed phase $m \neq 0, q \neq 0$ by equating the
denominator of (6-210) to zero. However, this would not work since (6-210)
has been obtained for the replica-symmetric solution, which loses validity
below the line ‘1’. The ferromagnetic and the mixed phase are
distinguished by the fact that the latter results from the former by loss
of stability where there are infinitely deep ‘valleys’ in the ‘free energy
landscape’ (refer back to sec. 6.5.6), and where this loss of stability is
associated with an irreversible transition (a reversible transition
requires that the depth of the valleys should increase continuously from
zero). The line ‘4’ is obtained by assuming that the permutation symmetry
among the replicas is broken. We will briefly consider this breaking of
the permutation symmetry among the replicas (commonly referred to as
‘replica symmetry breaking’) in sec. 6.5.9 below. The SK solution, when
considered for non-zero values of $J_0$, predicts a spurious transition
(not shown in the figure) from the paramagnetic to the spin-glass phase
through an intermediate ferromagnetic regime for a certain range of values
of $J_0$, and does not predict the transition to the mixed ferromagnetic
phase.


6.5.7.2 SK equations in the mean field approach 


The SK theory described in sec. 6.5.7.1 is analogous to a mean field
approach for working out the thermodynamics of spin glass. The infinite
range characterizing the spin-spin interaction in the SK model, along with
the assumption of identical distributions of all the coupling strengths,
implies that all the spins are located in environments equivalent to one
another, in virtue of which one can attempt to set up an analogue of the
Weiss mean field theory of ferromagnetism (as in sections 6.2.3.2,
6.2.3.3).


Thus, the mean magnetization $m_i$ ($= \langle s_i \rangle$) at any site $i$ can be expressed
as (refer to (6-27))

$m_i = \tanh(\beta h_i)$, (6-219a)

where $h_i$ is an effective field at site $i$, representing the action of all the
other spins, along with the external field $h$:

$h_i = h + \sum_j J_{ij} m_j$. (6-219b)

Unlike the Weiss theory, however, formulae (6-219a), (6-219b) do not constitute
a closed system because of the presence of the random variables
$J_{ij}$ representing the disorder, and a self-consistent solution is not immediately
available. Unlike the case of the regular Ising model considered
in sec. 6.2.3.2, one has to average over the disorder in order to arrive
at meaningful results. Indeed, the effective field $h_i$ at any specified site
now appears as a random variable and one has to know how the $h_i$’s are
distributed in order to make progress.


1. The replica trick constitutes one approach toward solving the problem
of averaging over the disorder. As we have seen, the replica symmetric
solution obtained within this approach does not actually yield
meaningful results for $T < T_f$, and one has to break the permutation
symmetry among the replicas (see sec. 6.5.9 below) to complete
the solution. On the other hand, one can also explore the alternative
(though equivalent) approach of working out the distribution of
the effective field $h_i$ along a more direct route, which we initiate in
the present section. As we will shortly see, it is not difficult to arrive
at the SK equations (6-204b), (6-204c) in this approach, where the
replica symmetry is honored, while the scenario for $T < T_f$ once again
requires more detailed and complex considerations, briefly outlined in
sec. 6.5.8 below.


2. The field $h_i$ is termed the ‘effective’ field here, while the qualifying term
‘mean’ is reserved to denote the result of averaging over the disorder.


We begin by making the assumption (a flawed one, as we will see; refer
to sec. 6.5.8 for the outline of a more complete approach) that the effects
of the various different sites at any specified site $i$ are independent random
variables and invoke the central limit theorem in the thermodynamic
limit ($N \to \infty$), thereby arriving at the conclusion that the effective field
is a Gaussian random variable. The mean effective field is then obtained
from (6-219b) and the first relation in (6-196b) as $J_0 m + h$. We further
assume that the magnetizations at the different sites are independent of
each other and also of the coupling strengths $J_{ij}$, in which case the variance
of the effective field at each site is obtained as
$N\,\overline{(J_{ij} - \overline{J_{ij}})^2}\;\overline{m^2} = J^2 q$,
where the order parameter $q$ is defined as $q = \overline{m^2}$. In other words, introducing
a Gaussian random variable $z$ with unit variance and zero mean,
the random variable representing the effective field at each site can be
represented as $J\sqrt{q}\,z + J_0 m + h$, which is precisely the expression (6-204d):

random variable representing the effective field: $h(z) = J\sqrt{q}\,z + J_0 m + h$. (6-220)


Following (6-219a), we now consider the random variable representing
the magnetic moment $m(z) = \tanh(\beta h(z))$ and the squared magnetic moment
$q(z) = m^2(z)$ (note that this definition is an implicit one, since $h(z)$
itself involves $m$ and $q$) so as to
obtain the mean values (i.e., disorder-averaged values) satisfying, precisely,
the SK equations (6-204c), (6-204b). This derivation of the SK
equations shares the same limitations as the derivation making
use of the replica trick. The TAP approach outlined in sec. 6.5.8 below
overcomes these limitations to a considerable extent.
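
The self-consistency described above can be explored numerically. The following is a minimal sketch, assuming that the SK equations amount to $m = \int Dz\,\tanh(\beta h(z))$ and $q = \int Dz\,\tanh^2(\beta h(z))$ with $h(z)$ as in (6-220), where $Dz$ is the unit Gaussian measure. The function names, the trapezoidal quadrature, and the plain fixed-point iteration are illustrative choices, not part of the original derivation.

```python
import math

def gaussian_average(func, n_points=801, z_max=8.0):
    """Average func(z) over a unit Gaussian z using the trapezoidal rule."""
    dz = 2.0 * z_max / (n_points - 1)
    total = 0.0
    norm = 0.0
    for k in range(n_points):
        z = -z_max + k * dz
        w = math.exp(-0.5 * z * z)
        if k == 0 or k == n_points - 1:
            w *= 0.5  # end-point weights of the trapezoidal rule
        total += w * func(z)
        norm += w
    return total / norm

def solve_sk(beta, J=1.0, J0=0.0, h=0.0, m=0.5, q=0.5, n_iter=2000, tol=1e-10):
    """Fixed-point iteration for (m, q) with h(z) = J*sqrt(q)*z + J0*m + h."""
    for _ in range(n_iter):
        def field(z, m=m, q=q):
            return J * math.sqrt(q) * z + J0 * m + h
        m_new = gaussian_average(lambda z: math.tanh(beta * field(z)))
        q_new = gaussian_average(lambda z: math.tanh(beta * field(z)) ** 2)
        converged = abs(m_new - m) + abs(q_new - q) < tol
        m, q = m_new, q_new
        if converged:
            break
    return m, q

# Units with k_B = 1 and J = 1 (an assumption of this sketch).
m_para, q_para = solve_sk(beta=0.5)        # T = 2J: paramagnetic, q -> 0
m_sg, q_sg = solve_sk(beta=2.0)            # T = J/2: m = 0, q > 0
m_fm, q_fm = solve_sk(beta=2.0, J0=2.0)    # strong ferromagnetic bias: m > 0
```

With $J_0 = 0$ and $h = 0$ the iteration settles at $q = 0$ for $\beta J < 1$ and at a non-zero $q$ for $\beta J > 1$, in line with the replica-symmetric picture; below the freezing temperature the numbers should be read with the limitations mentioned above in mind.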


6.5.8 The TAP approach 


Thouless, Anderson, and Palmer made a decisive improvement over the
naive mean field approach outlined in sec. 6.5.7.2 that led to a great
deal of clarification regarding the problem of spin-glass phase transition
including, in particular, the scenario below the transition temperature $T_f$
(recall our use of the term ‘freezing transition’ in this context). The theory
initiated by them is referred to as the TAP approach.


Following a suggestion by Onsager, TAP modified the naive mean field
equation (6-219b) by introducing a correction term representing the ‘reaction’
of any specified site (site $i$ in the case of (6-219b)) on all the other
sites that produce the effective internal field at that site. This reaction
term is to be subtracted from the expression in (6-219b), and the modified
effective field now appears in the form

$h_i = h + \sum_j J_{ij}\,[m_j - J_{ji} m_i \chi_{jj}]$. (6-221a)

In this expression, the effective internal field at site $i$ has been written
in the form $\sum_j J_{ij}[m_j - J_{ji} m_i \chi_{jj}]$ by replacing $m_j$ in the right hand side
of (6-219b) with $m_j - m_j^{(i)}$, where $m_j^{(i)}$ stands for the reaction term expressing
the magnetic moment induced at site $j$ by the spin at site $i$ (we replace, in
the spirit of the mean field theory, the spin $s_i$ with its expectation value,
i.e., with $m_i$). This has been expressed in the above formula as

$m_j^{(i)} = J_{ji} m_i \chi_{jj}$, (6-221b)

where $\chi_{jj} = \partial m_j / \partial h_j$ is the local susceptibility at site $j$ that is obtained by
taking the derivative of $m_j$ with respect to the local field $h_j$ at site $j$. In
this case, making use of the expression $m_j = \tanh(\beta h_j)$ and retaining
terms up to the first non-leading approximation, one obtains

$\chi_{jj} = \beta (1 - m_j^2)$. (6-221c)


Making use of (6-221b), (6-221c) in (6-221a), we arrive at the mean field
model including the ‘reaction term’ correction

$m_i = \tanh(\beta h_i)$,

$h_i = h + \sum_j [J_{ij} m_j - \beta J_{ij}^2 m_i (1 - m_j^2)]$. (6-221d)


These are referred to as the TAP equations and, for given coupling strengths
$\{J_{ij}\}$, correctly describe the SK system in the thermodynamic limit. In
contrast to a ferromagnetic coupling, where the reaction term becomes
irrelevant in the thermodynamic limit, such a term is a necessity for a
correct description of the thermodynamic behavior of the SK model since
it becomes comparable with the conventional term $\sum_j J_{ij} m_j$. A solution to
the TAP equations for the set of magnetizations $\{m_i\}$ describes an equilibrium
state of some particular realization of the SK system, corresponding
to a specified set of values of the random variables $J_{ij}$. An averaging over
the disorder may be necessary for quantities such as the magnetization
$m_i$ at any particular site, while quantities such as $\lim_{N\to\infty} \frac{1}{N}\sum_i m_i$ are
self-averaging.
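
A small numerical experiment may help in getting a feel for the TAP equations. The sketch below draws one realization of the couplings and iterates $m_i = \tanh(\beta h_i)$ with the reaction-corrected field $h_i = h + \sum_j [J_{ij} m_j - \beta J_{ij}^2 m_i (1 - m_j^2)]$. It assumes zero-mean Gaussian couplings of variance $J^2/N$ (i.e., $J_0 = 0$); the damping factor, the system size and the function names are illustrative choices, and a damped iteration of this kind is only expected to converge reliably well above the freezing temperature.

```python
import math
import random

def sample_couplings(n, J=1.0, seed=0):
    """Symmetric couplings with J_ii = 0 and Gaussian J_ij of variance J^2/n."""
    rng = random.Random(seed)
    sigma = J / math.sqrt(n)
    Jmat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            Jmat[i][j] = Jmat[j][i] = rng.gauss(0.0, sigma)
    return Jmat

def tap_step(m, Jmat, beta, h=0.0):
    """One application of the TAP map: m_i -> tanh(beta * h_i)."""
    n = len(m)
    new_m = []
    for i in range(n):
        field = h
        for j in range(n):
            Jij = Jmat[i][j]
            # naive mean field term minus the Onsager reaction term
            field += Jij * m[j] - beta * Jij * Jij * m[i] * (1.0 - m[j] ** 2)
        new_m.append(math.tanh(beta * field))
    return new_m

def iterate_tap(Jmat, beta, h=0.0, damping=0.5, n_iter=500, tol=1e-10, seed=1):
    """Damped fixed-point iteration starting from small random magnetizations."""
    rng = random.Random(seed)
    m = [rng.uniform(-0.1, 0.1) for _ in Jmat]
    for _ in range(n_iter):
        target = tap_step(m, Jmat, beta, h)
        new_m = [(1.0 - damping) * a + damping * b for a, b in zip(m, target)]
        change = max(abs(a - b) for a, b in zip(m, new_m))
        m = new_m
        if change < tol:
            break
    return m

Jmat = sample_couplings(40)
m_high_T = iterate_tap(Jmat, beta=0.3)          # relaxes to m_i = 0
m_field = iterate_tap(Jmat, beta=0.3, h=0.2)    # field-induced magnetizations
```

At low temperatures the same map has many fixed points and a simple iteration of this kind cannot be relied upon to find them.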


The TAP equations lead to the correct spin-glass transition temperature.
To see this, we retain the linear terms in (6-221d), ignoring terms of
higher order of smallness, so as to find the condition under which small
non-zero solutions for the $m_i$’s appear from the one with $m_i = 0$ for all
$i$, the latter corresponding to the paramagnetic phase. Setting the external
field $h$ at zero, and ignoring terms of degree three in the $m_i$’s, the
linearized TAP equations appear as

$m_i = \beta \sum_j J_{ij} m_j - \beta^2 J^2 m_i$, (6-222a)

where we have written $\sum_j J_{ij}^2 = J^2$ in virtue of the second relation in (6-196b)
(reason this out). One can look at the $m_i$’s as components of a vector $|m\rangle$
and the coupling strengths $J_{ij}$ as elements of a matrix $\mathcal{J}$, when the above
equation appears in the form

$(1 + (\beta J)^2)|m\rangle = \beta \mathcal{J}|m\rangle$. (6-222b)


The theory of random matrices tells us that the eigenvalues of the matrix
$\mathcal{J}$ obey a semi-circular distribution ([37]) with the density function

$\rho(\lambda) = \frac{1}{2\pi J^2}\sqrt{4J^2 - \lambda^2}$, (6-223a)

where $\lambda$ stands for a typical eigenvalue, the largest eigenvalue being

$\lambda_{\max} = 2J$. (6-223b)


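If $|\lambda\rangle$ stands for the eigenvector of $\mathcal{J}$ corresponding to eigenvalue $\lambda$ and
$m_\lambda = \langle\lambda|m\rangle$ for the component of $|m\rangle$ along $|\lambda\rangle$, one can write (6-222b) as

$(1 + (\beta J)^2 - \beta\lambda)\, m_\lambda = 0$. (6-224)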


One observes that, for $\beta J < 1$, the coefficient $1 + (\beta J)^2 - \beta\lambda$ is positive for
$\lambda = 2J$, and hence for all $\lambda$, i.e., the only possible solution for the TAP
equations is given by $m_\lambda = 0$ for all $\lambda$ ($\leq 2J$), i.e., $m_i = 0$ for all $i$. On
the other hand, as $\beta J$ crosses the value 1 (i.e., as the temperature crosses
$J/k_B$ from above), the component $m_{2J}$
corresponding to the largest $\lambda$ acquires a non-zero value, i.e., a non-trivial
solution of the TAP equations emerges with $k_B T/J = 1 - \epsilon$ ($\epsilon > 0$). This implies
a phase transition at $T = T_f = J/k_B$, as obtained from the SK equations.

This conclusion is confirmed by taking into account the cubic terms in the
$m_i$’s that were left out in (6-222a) (check this out). Note that, if the reaction
term were not included in the TAP equations, the non-trivial solution for $m_{2J}$
would appear at $\beta J = \frac{1}{2}$.
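
The role of the reaction term can be made explicit by evaluating the coefficient of $m_\lambda$ in (6-224) at the upper edge $\lambda_{\max} = 2J$ of the spectrum, with and without the term $(\beta J)^2$. The following short check uses only (6-223b) and (6-224):

```latex
\begin{align*}
\text{with reaction term:}\quad
  1 + (\beta J)^2 - \beta\lambda_{\max}
  &= 1 - 2\beta J + (\beta J)^2 = (1 - \beta J)^2 \;\ge\; 0,
  \qquad \text{vanishing only at } \beta J = 1,\\[4pt]
\text{without reaction term:}\quad
  1 - \beta\lambda_{\max}
  &= 1 - 2\beta J = 0
  \qquad \text{at } \beta J = \tfrac{1}{2}.
\end{align*}
```

The naive mean field equations would thus place the instability of the paramagnetic solution at twice the correct temperature.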


On looking at the TAP equations for $T = T_f - \tau$ ($\tau > 0$), with $\tau$ small, one
finds that the non-trivial solution is given by


$m_i \approx \tau^{1/2} \phi_i$, (6-225)

where $\phi_i$ denotes the $i$th component of the eigenvector of $\mathcal{J}$ corresponding
to the largest eigenvalue $2J$ (check this out).


The stability of this non-trivial solution can now be checked so as to
confirm that the condition $\beta J = 1$ does indeed correspond to a phase
transition. This can be done by referring to the free energy functional

$F[\{m_i\}] = -\frac{1}{2}\sum_{i,j} J_{ij} m_i m_j - \frac{\beta}{4}\sum_{i,j} J_{ij}^2 (1 - m_i^2)(1 - m_j^2) - \sum_i h_i m_i + \frac{1}{2\beta}\sum_i \left[(1 + m_i)\ln\frac{1 + m_i}{2} + (1 - m_i)\ln\frac{1 - m_i}{2}\right]$, (6-226)

in terms of which the TAP equations appear as the solution to the variational
principle $\partial F/\partial m_i = 0$ (check this out). In writing out the above free energy
functional we have, for the sake of generality, replaced the uniform
external field parameter $h$ with a site-dependent field $h_i$. The condition
of stability of a solution (either the trivial solution where all the $m_i$’s are
zero or a solution with non-zero $m_i$’s) is that the eigenvalues of the Hessian
matrix $\partial^2 F/\partial m_i \partial m_j$ are to be all non-negative, which translates into the
requirement

$1 - (\beta J)^2 (1 - 2q_{EA} + r) \geq 0$, (6-227a)

where

$q_{EA} = \lim_{N\to\infty} \frac{1}{N}\sum_i m_i^2, \qquad r = \lim_{N\to\infty} \frac{1}{N}\sum_i m_i^4$. (6-227b)


This requirement is violated for the paramagnetic solution ($m_i = 0$ for
all $i$) as $T$ becomes less than $T_f$, in which case a new set of solutions
to the TAP equations emerges as stable ones. It is to be noted that the
requirement (6-227a) turns out to be identical to the condition (6-218a),
(6-218b) for stability of the SK solution, based on which one obtains the
AT stability line depicted in fig. 6-24. It may be noted that the parameter $r$
defined in the second relation in (6-227b) is identical to that defined in
(6-218b) when one refers to the replica symmetric SK solution.
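
The variational statement and the stability requirement can both be checked numerically on a small sample. The sketch below assumes the TAP free energy in the form $F = -\frac{1}{2}\sum_{i,j} J_{ij} m_i m_j - \frac{\beta}{4}\sum_{i,j} J_{ij}^2(1-m_i^2)(1-m_j^2) - \sum_i h_i m_i + \frac{1}{2\beta}\sum_i[(1+m_i)\ln\frac{1+m_i}{2} + (1-m_i)\ln\frac{1-m_i}{2}]$ with symmetric couplings and $J_{ii} = 0$. It compares a finite-difference gradient of $F$ with the expression $\beta^{-1}\tanh^{-1} m_i - h_i^{\rm TAP}$ (whose vanishing is the TAP equation), and evaluates the left hand side of the stability requirement for a finite sample. All names and the sample size are hypothetical.

```python
import math
import random

def tap_free_energy(m, Jmat, h, beta):
    """TAP free energy for magnetizations m, couplings Jmat and site fields h."""
    n = len(m)
    F = 0.0
    for i in range(n):
        for j in range(n):
            Jij = Jmat[i][j]
            F -= 0.5 * Jij * m[i] * m[j]
            F -= 0.25 * beta * Jij ** 2 * (1 - m[i] ** 2) * (1 - m[j] ** 2)
        F -= h[i] * m[i]
        F += (0.5 / beta) * ((1 + m[i]) * math.log((1 + m[i]) / 2)
                             + (1 - m[i]) * math.log((1 - m[i]) / 2))
    return F

def tap_gradient(m, Jmat, h, beta):
    """Analytic dF/dm_i = atanh(m_i)/beta - (reaction-corrected effective field)."""
    n = len(m)
    grad = []
    for i in range(n):
        field = h[i]
        for j in range(n):
            Jij = Jmat[i][j]
            field += Jij * m[j] - beta * Jij ** 2 * m[i] * (1 - m[j] ** 2)
        grad.append(math.atanh(m[i]) / beta - field)
    return grad

def numerical_gradient(m, Jmat, h, beta, eps=1e-6):
    """Central finite-difference approximation to dF/dm_i."""
    grad = []
    for i in range(len(m)):
        up = list(m)
        dn = list(m)
        up[i] += eps
        dn[i] -= eps
        diff = tap_free_energy(up, Jmat, h, beta) - tap_free_energy(dn, Jmat, h, beta)
        grad.append(diff / (2 * eps))
    return grad

def stability_coefficient(m, beta, J=1.0):
    """Finite-sample value of 1 - (beta*J)^2 * (1 - 2*q_EA + r)."""
    n = len(m)
    q_ea = sum(x ** 2 for x in m) / n
    r = sum(x ** 4 for x in m) / n
    return 1.0 - (beta * J) ** 2 * (1.0 - 2.0 * q_ea + r)

# A hypothetical six-site sample with random symmetric couplings.
rng = random.Random(3)
n = 6
Jmat = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        Jmat[i][j] = Jmat[j][i] = rng.gauss(0.0, 1.0 / math.sqrt(n))
m = [rng.uniform(-0.8, 0.8) for _ in range(n)]
h = [rng.uniform(-0.3, 0.3) for _ in range(n)]
mismatch = max(abs(a - b) for a, b in zip(tap_gradient(m, Jmat, h, 1.2),
                                          numerical_gradient(m, Jmat, h, 1.2)))
```

For the paramagnetic solution (`m = [0.0] * n`) the coefficient reduces to $1 - (\beta J)^2$, which changes sign at $\beta J = 1$.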


There exists a vast literature based on numerical study relating to the
TAP equations, leading to meaningful and significant results on spin-glass
states below the AT line, while analytical results are known in the
limiting situations $T \to T_f$ and $T \to 0$ (recall that the freezing temperature
$T_f$ depends on the external field $h$ and also on the parameter $J_0$ that
determines the strength of the ferromagnetic interaction).


Among the results arrived at, those relating to the number of solutions to
the TAP equations are of great relevance. It transpires that, as the stability
limit (6-227a) is crossed (i.e., as $T$ becomes less than $T_f$), the number
of solutions proliferates exponentially. Denoting the number of solutions
for any given value of the free energy density $f$ by $\mathcal{N}(f)$, one obtains the
result that, for $T < T_f$, $\mathcal{N}(f)$, averaged over all possible realizations of the
$J_{ij}$’s, is of the form

$\mathcal{N}(f) \sim e^{N w_1(f)}$ ($w_1(f) > 0$). (6-228a)

However, for an arbitrarily chosen $T$ in the range $0 < T < T_f$, the average
value of the number of solutions is not necessarily the most likely value.
More precise results are obtained for $T \approx 0$, $N \to \infty$, in which case one
has

$\mathcal{N}(f) \sim e^{N w_0(f)}$ ($0 < w_0(f) < w_1(f)$), (6-228b)

with probability 1.


More detailed results are available for $h = 0$, $T = 0$, when the free energy
reduces to the energy $E$ ($\approx N\epsilon$) of the system. Denoting by $\mathcal{N}(\epsilon)$
the average number of solutions for specific energy $\epsilon$, the variation of
$\frac{1}{N}\ln\mathcal{N}(\epsilon)$ with $-\epsilon/J$ is depicted in fig. 6-26, where one finds a maximum
with $\frac{1}{N}\ln\mathcal{N}(\epsilon) \approx 0.2$, and a certain minimum value of $\epsilon$ ($= -0.67J$) such
that there appears no solution below that value, shown by the dotted line
in the figure. The total number of solutions (for all possible
values of $\epsilon$) is given by

$\mathcal{N} \sim e^{0.2N}$, (6-228c)

since the dominant contribution to this number comes from the value of
$\mathcal{N}(\epsilon)$ corresponding to the maximum point of the curve in fig. 6-26.


Analogous results have been obtained for $h \neq 0$, $T \neq 0$. In particular, there
appears to exist a minimum value of $f$ ($= f_0$, say) such that the number of
stable solutions to the TAP equations gets drastically reduced for $f$ close
to $f_0$, when one obtains (refer to (6-228b))


$w_0 = \mu (f - f_0)$, ($0 < \mu < \beta$). (6-228d)

Figure 6-26: Depicting schematically the variation of $\frac{1}{N}\ln\mathcal{N}(\epsilon)$ as a function of $-\epsilon/J$,
where $\mathcal{N}(\epsilon)$ stands for the number of TAP solutions at specific energy $\epsilon$ for large $N$, with
$h = 0$, $T = 0$; note the existence of a maximum value of $\frac{1}{N}\ln\mathcal{N}(\epsilon)$; no solution exists for
$\epsilon/J < -0.67$ (dotted part of the graph).


The minimum free energy density $f_0$ in the above formula is of relevance
in that, in the thermodynamic limit, the states with this value of the free 
energy density constitute the pure states into which any mixed state can 


be decomposed. 


Here I repeat the warning that the use of the terms pure and mixed states in 
the present context is in deference to common practice and should not cause 
confusion (refer back to sec. 6.5.6): earlier we have referred to these as pure 
and mixed phases; in other words, the term ‘state’ is being used here in a 
loose sense. Properly speaking, a pure state is to be represented by a point 
in the phase space while a mixed state corresponds to an ensemble of pure 
states. On the other hand, the term ‘pure state’ in the above paragraph 
stands for an equilibrium state which, at any specified temperature, is itself 
a mixture of states (represented by points in the phase space); a mixed 
‘state’ (more properly, a mixed ‘phase’) is then a convex combination of pure 


phases. 

The pure phases with free energy density $f_0$ correspond to the ergodic
components of the full phase space describing the system under consideration
in the thermodynamic limit, where the ergodic components occur
in pairs related by the global spin flip symmetry of the Hamiltonian. For
a large but finite system, the solutions to the TAP equations at any sufficiently
large value of $\beta$ differ in the values of their free energies $F$. As
the system size $N$ is made to go to infinity, those solutions for which
$F/N$ approaches the value $f_0$, i.e., the ones for which the free energies differ
from the minimum value by amounts that vanish in comparison to $N$ in
the thermodynamic limit, acquire the significance of absolute minima of
the free energy, while there occur a large number of other solutions that
represent local minima. The former correspond to infinitely deep valleys
in the ‘free energy landscape’ (refer, once again, to sec. 6.5.6) while the
latter have finite barriers separating them from other minima of the free
energy.


The multiplicity of pure phases is indicative of broken ergodicity, when 
the phase space is broken up into disjoint ergodic components. Each 
ergodic component is made up of pure states represented by points in the 
phase space, and a pure phase corresponds to an ensemble of the pure 
states making up the component in question. While there exist a host of 
mixed states that can be described in terms of the pure states within any 
specified ergodic component, these are not of immediate relevance in the 
present context, since these do not describe equilibrium configurations 
of the system. On the other hand, mixed phases representing possible 


equilibrium configurations can be constructed as convex combinations of
the pure phases, where these typically involve the pure states (i.e., points
in the phase space) in all the ergodic components of the phase space.


Fig. 6-27(A) is a pictorial depiction of the ergodic components of the phase 
space, aimed at making the above statements more transparent while 
fig. 6-27(B) is a similar hypothetical representation of the ‘free energy 


landscape’. 


The large number of pure phases below the transition temperature $T_f$
($= J/k_B$) distinguishes the spontaneous symmetry breaking in the spin glass
transition from that in the ferromagnetic transition (refer back to sections
6.2.4 and 6.2.5; see also sec. 6.5.6 for general background). In the
ferromagnetic transition described by the regular Ising model, the phase
space decomposes into two disjoint ergodic components and the equilibrium
state of the system under consideration corresponds to only one of
the two components in the absence of an external field ($h$ in the present
context). Neither of the two ergodic components is invariant under the
global spin-flip symmetry, which is why the occurrence of any one of the
two equilibrium states (for $h = 0$) is referred to as a spontaneous breakdown
of symmetry. In real life, the breakdown of symmetry is not spontaneous
since any chance occurrence of an infinitesimally weak magnetic
field selects out one of the two states (or, more appropriately, the two
‘phases’ representing equilibrium configurations of the system), thereby
selecting one of the two ergodic components of the phase space. As mentioned
several times in earlier paragraphs, the order parameter distinguishing
these two components is precisely the mean magnetic moment
that takes up distinct values (related to each other by the spin flip symmetry)
in the two corresponding equilibrium states.

Figure 6-27: (A) a pictorial and schematic depiction of the mutually disjoint ergodic
components of the phase space; each of the ergodic components (represented as cells)
includes numerous pure states (the dots shown in some of the cells), where some particular
ensemble of the pure states describes the pure phase (at times referred to as a pure
‘state’ at the cost of some confusion) corresponding to the ergodic component in question;
a mixed phase, on the other hand, involves a convex combination of pure phases,
each of the pure phases occurring with some specified weight in the combination; the
pure phases appear as the solutions to the TAP equations in the thermodynamic limit,
all having the same value ($f_0$) of the free energy per site; (B) a hypothetical graphical
representation of the ‘free energy landscape’ for an SK system of large size ($N$); a number
of local minima of the free energy functional (for instance, the ones marked A, B, C) correspond,
in the thermodynamic limit, to a single ergodic component of the phase space,
with the associated global minimum $E_1$; other ergodic components correspond to global
minima $E_2, E_3, \cdots$; in the thermodynamic limit, all the global minima have the same
specific free energy $f$ ($= f_0$), and each of these (associated with some particular ergodic
component) is isolated from the minima associated with the other ergodic components
by infinitely high energy barriers; for a large but finite value of $N$, the local and global
minima arise as solutions to the TAP equations, their total number being exponentially
large in $N$; in the thermodynamic limit, only the global minima represent equilibrium
configurations (pure phases); the number of pure phases is less than exponentially large
in $N$; the abscissa represents values of a (hypothetical) single phase space co-ordinate;
in reality, the landscape involves all the phase space co-ordinates for a large but finite
system.


In the case of the spin-glass transition, the number of ergodic components
below $T_f$ is large, and these are not all related by the spin-flip symmetry
(more precisely, there occur pairs of ergodic components with the
members of each pair related by the spin-flip symmetry). In the
absence of an external factor such as a uniform magnetic field forcing the
system to any one of the alternative equilibrium states, it makes a ‘choice’
as if spontaneously (in reality the ‘choice’ is determined by chance factors,
involving the history the system has been through). Since there is
no obvious rule determining the directions of freezing of the spins, these 
directions are, to all intents and purposes, random. If an equilibrium 
state is to be forced upon the system by an external field, it has to vary 
from one site to another, just along the directions ‘chosen’ by the spins. 
In other words, the magnetic field has to vary from one site to another 
in accordance with the microscopic details of the state it is to force the 


system into. 


In keeping with this aspect of complexity of the transition to the spin-glass
phase, one faces the problem of distinguishing between the various
possible ergodic components of the phase space (i.e., equivalently, between
the possible equilibrium states) by means of an order parameter.
The single order parameter

$q_{EA} = \lim_{N\to\infty} \frac{1}{N}\sum_i m_i^2$ (6-229)

is evidently inadequate for this task because the above formula does not
tell us which of the various possible equilibrium states the magnetizations
$m_i$ refer to. Indeed, the inadequacy becomes manifest as one notes
that the Edwards-Anderson order parameter has the same value for all
the pure phases (pure ‘states’ by commonly employed vocabulary; I feel I
must keep on reminding you of this little anomaly in terminology).


One need not include an averaging over the disorder in the above definition
of $q_{EA}$ because of the property of self-averaging.


One anticipates that an appropriate order parameter distinguishing between
the various possible ergodic components has to incorporate information
of some measure of dissimilarity between all possible pairs of
these components (where the members of each pair are related by the
global spin flip symmetry). This leads to the idea of the overlap function
to be outlined in sec. 6.5.9 below where we find that, adopting the replica
approach and breaking the permutation symmetry among replicas, one
is led to a method for the computation of the overlap function that is of
practically realizable proportions.


6.5.9 The overlap function and the breaking of replica symmetry


The content of this section, relating to Parisi’s solution of the SK model
below the freezing temperature ($T_f \leq J/k_B$), is based principally on that of [102],
chapter III, [132].

Attempts at solving the TAP equations lead us to the conclusion that, below
the transition temperature $T_f$, the phase space gets partitioned into
a large number of ergodic components, each corresponding to an equilibrium
configuration of the system, all such pure phases being characterized
by the same value of the free energy density $f = f_0$.


In the following, we label parameters pertaining to the various pure phases
by sub-indices such as $[a]$, $[b]$. Thus, $m_{i[a]} = \langle s_i \rangle_{[a]}$ stands for the magnetization
at site $i$ in the pure phase $a$, where the expectation value (for
any specified inverse temperature $\beta$) is with respect to the possible spin
configurations that make up the corresponding ergodic component of the
phase space.


In order to arrive at an appropriate order parameter in terms of which
all the possible pure phases can be distinguished, we define the overlap
between any two pure phases $[a]$, $[b]$ as

$q_{[ab]} = \lim_{N\to\infty} \frac{1}{N}\sum_i m_{i[a]} m_{i[b]}$, (6-230)

where the case $a = b$ gives the self-overlap $q_{EA}$ of a pure phase, which
is actually the same for all the ergodic components of the phase space
([102]). For $T > T_f$ there exists only one pure phase, for which $q_{EA}$, the
Edwards-Anderson order parameter, is a useful object (denoted by $q$ in
sec. 6.5.7) that, along with $m = \lim_{N\to\infty} \frac{1}{N}\sum_i m_i$, determines the thermodynamic
parameters of the system. For $T < T_f$, on the other hand, one
needs all the overlaps $q_{[ab]}$ to characterize the multiple equilibria of the
system. As we see below, all these overlaps are encoded in the overlap
function $P(q)$ that now serves as the order parameter, one of a novel
type, indicative of the complexity of the scenario below the freezing temperature.


The overlaps $q_{[ab]}$ indicate the degree of similarity of the various pure
phases to one another. Put differently, the quantities

$d^2_{[ab]} = (q_{EA} - q_{[ab]})$, (6-231)

can be looked upon as a measure of a squared distance between the
phases $[a]$, $[b]$.
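
For a finite sample the overlap is just a normalized dot product of two magnetization profiles. The sketch below (with made-up profiles and hypothetical function names) computes overlaps and also the Euclidean squared distance per site, which satisfies the identity $\frac{1}{N}\sum_i (m_{i[a]} - m_{i[b]})^2 = q_{[aa]} + q_{[bb]} - 2q_{[ab]}$ and so equals $2(q_{EA} - q_{[ab]})$ whenever the two self-overlaps are equal.

```python
def overlap(m_a, m_b):
    """q_[ab] = (1/N) * sum_i m_i[a] * m_i[b] for two magnetization profiles."""
    if len(m_a) != len(m_b):
        raise ValueError("profiles must have the same number of sites")
    return sum(x * y for x, y in zip(m_a, m_b)) / len(m_a)

def squared_distance(m_a, m_b):
    """(1/N) * sum_i (m_i[a] - m_i[b])**2, a per-site Euclidean distance."""
    if len(m_a) != len(m_b):
        raise ValueError("profiles must have the same number of sites")
    return sum((x - y) ** 2 for x, y in zip(m_a, m_b)) / len(m_a)

# Made-up profiles on four sites; phase b is the global spin flip of phase a.
m_a = [0.6, -0.6, 0.6, 0.6]
m_b = [-x for x in m_a]
m_c = [0.6, 0.6, -0.6, 0.6]

q_aa = overlap(m_a, m_a)          # self-overlap, plays the role of q_EA
q_ab = overlap(m_a, m_b)          # spin-flipped partner: -q_EA
q_ac = overlap(m_a, m_c)          # an unrelated profile
d2_ac = squared_distance(m_a, m_c)
```

In this toy case the three profiles share the same self-overlap, so the distance between any two of them is fixed by their mutual overlap alone.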


While the multiplicity of the pure phases is characterized by the overlap 
function to be defined below, one needs to look at mixed phases so as to 


arrive at an effective method for the computation of this function. 


A mixed phase is a convex combination of pure phases, characterized by
a set of weights of the latter. Thus, referring to any observable $A$ of the
system, its expectation value in a mixed phase can be expressed in the
form

$\langle A \rangle = \sum_a w_{[a]} \langle A \rangle_{[a]}$, (6-232)

where $w_{[a]}$ ($0 \leq w_{[a]} \leq 1$, $\sum_a w_{[a]} = 1$) is the weight of the pure phase $a$ in the mixed
phase under consideration. The summation runs nominally over all the
pure phases but only those which occur with $w_{[a]} \neq 0$ are relevant. The
decomposition of a mixed phase into pure phases is unique, while a pure
phase cannot be decomposed in terms of other pure phases. Pure phases
are at times referred to as extremal ones among the set of all mixed and
pure phases.


The fundamental distinction between a pure and a mixed phase lies in the
behavior of their correlation functions: generally speaking, in the case of a
pure phase a correlation function such as $\langle s_i s_j \rangle$ tends to $\langle s_i \rangle \langle s_j \rangle$ whenever
the separation between the sites $i$, $j$ goes to infinity while, in the case of
a mixed phase, there does not arise the question of such a loss of correlations
among distant spins. A similar statement holds for more complex
correlation functions as well. In the particular instance of the SK system,
where the interactions among the spins are of an infinite range, all the
spin pairs are effectively at an infinite separation, and we have

$\langle s_{i_1} s_{i_2} \cdots s_{i_k} \rangle_{[a]} = m_{i_1[a]}\, m_{i_2[a]} \cdots m_{i_k[a]}$, (6-233)

where $i_1, i_2, \cdots, i_k$ denote any specified set of sites, all distinct. Again,
such a decomposition does not hold for a mixed phase.
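
A simple illustration of the last statement, assuming (hypothetically) a mixed phase made up of a pure phase $[a]$ and its spin-flipped partner $[\bar{a}]$, with $m_{i[\bar{a}]} = -m_{i[a]}$ and equal weights $\frac{1}{2}$: using the convex decomposition of expectation values together with the factorization (6-233) within each pure phase, one has, for $i \neq j$,

```latex
\begin{align*}
\langle s_i s_j \rangle
  &= \tfrac{1}{2}\, m_{i[a]}\, m_{j[a]} + \tfrac{1}{2}\,(-m_{i[a]})(-m_{j[a]})
   = m_{i[a]}\, m_{j[a]}, \\
\langle s_i \rangle \langle s_j \rangle
  &= \Big(\tfrac{1}{2}\, m_{i[a]} - \tfrac{1}{2}\, m_{i[a]}\Big)
     \Big(\tfrac{1}{2}\, m_{j[a]} - \tfrac{1}{2}\, m_{j[a]}\Big) = 0 .
\end{align*}
```

The two-spin correlation thus fails to factorize in the mixed phase whenever the magnetizations of the constituent pure phases are non-zero.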


In the following we consider those solutions to the TAP equations that
correspond, in the thermodynamic limit, to the pure phases of the system
(referred to by means of the sub-index $[a]$), each characterized by the
specific free energy $f_0$. For a system of large but finite size $N$, the free
energies ($F_{[a]}$) of these solutions differ by small amounts such that
$F_{[a]}/N \to f_0$ for all $a$.


We now consider a mixed phase (one made up of the above solutions to
the TAP equations for a large but finite system) for which the weights $w_{[a]}$
satisfy

$w_{[a]} \propto e^{-\beta F_{[a]}}$, (6-234)

i.e., one for which all the weights are equal in the thermodynamic limit,
and refer to this, following [102], as the Gibbs state. The Gibbs state is of
central relevance in Parisi’s theory relating to the breaking of the permutation
symmetry among replicas, to be outlined in the present section.


More commonly, however, the term Gibbs state refers to any and every equilibrium
state of an infinitely large system for any specified value of the inverse
temperature $\beta$, defined as one that satisfies the DLR conditions (refer
back to sections 5.6, 6.2.5). In the present context, however, we follow the
terminology of [102].
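
For a finite list of TAP solutions the Gibbs-state weights are ordinary normalized Boltzmann factors of the free energies. A minimal sketch follows, with made-up free energy values and a hypothetical function name; the minimum is subtracted before exponentiating purely for numerical safety and does not affect the normalized weights.

```python
import math

def gibbs_weights(free_energies, beta):
    """Normalized weights w_a proportional to exp(-beta * F_a)."""
    f_min = min(free_energies)
    boltz = [math.exp(-beta * (F - f_min)) for F in free_energies]
    Z = sum(boltz)
    return [b / Z for b in boltz]

# Made-up free energies of three solutions (arbitrary units, k_B = 1).
weights = gibbs_weights([-10.0, -10.0, -9.5], beta=2.0)
```

Solutions with equal free energies receive equal weights, and the ratio of any two weights is $e^{-\beta(F_{[a]} - F_{[b]})}$.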


We now define the overlap function $P(q)$ as the probability distribution
function of the overlaps $q_{[ab]}$ averaged over all pairs of pure states $[a]$, $[b]$,
where each of the pure states is weighted by the factor $w^{[J]}_{[a]}$ of the form (6-234)
and where the dependence on the realization ($[J]$) of the random coupling
strengths $J_{ij}$ is explicitly referred to. One needs additionally to average
over the disorder so as to arrive at

$P(q) = \overline{P^{[J]}(q)} = \overline{\sum_{a,b} w^{[J]}_{[a]}\, w^{[J]}_{[b]}\, \delta(q^{[J]}_{[ab]} - q)}$, (6-235)

where $P^{[J]}(q)$ denotes the overlap distribution function for some particular
realization of the disorder denoted by $[J]$.
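
For a single realization and a finite list of phases, the sample-dependent overlap distribution is a sum of point masses $w_{[a]} w_{[b]}$ located at the overlaps $q_{[ab]}$. The sketch below builds this discrete distribution and its moments for a toy case of two phases related by a global spin flip; the profiles, weights and function names are invented for the illustration, and no disorder average is attempted.

```python
from collections import defaultdict

def overlap_distribution(profiles, weights, ndigits=10):
    """Map each overlap value q_ab to its total probability mass sum w_a * w_b."""
    n_sites = len(profiles[0])
    dist = defaultdict(float)
    for m_a, w_a in zip(profiles, weights):
        for m_b, w_b in zip(profiles, weights):
            q_ab = sum(x * y for x, y in zip(m_a, m_b)) / n_sites
            # rounding merges overlaps that differ only by floating-point noise
            dist[round(q_ab, ndigits)] += w_a * w_b
    return dict(dist)

def moment(dist, k):
    """k-th moment of the discrete overlap distribution."""
    return sum(p * q ** k for q, p in dist.items())

# Toy example: two phases with uniform magnetization +m and -m, equal weights.
m = 0.8
up = [m] * 6
down = [-m] * 6
P = overlap_distribution([up, down], [0.5, 0.5])
first_moment = moment(P, 1)
second_moment = moment(P, 2)
```

Here `P` carries mass $\frac{1}{2}$ at $q = m^2$ and $\frac{1}{2}$ at $q = -m^2$, so that the first moment vanishes while the second equals $m^4$.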

In the case of a ferromagnetic transition, $P(q)$ is a delta function ($\delta(q - m^2)$)
in the presence of a weak magnetic field ($m$ = spontaneous magnetization),
while at zero magnetic field, the distribution $P(q)$ involves two delta
functions, one at $q = m^2$ and the other at $-m^2$, arising due to the spin
flip operation. In the spin-glass transition, on the other hand, $P(q)$ is of a
more complex structure in virtue of the occurrence of multiple equilibria.
In order to compute $P(q)$, it is fruitful to know its moments of various
orders. As a first step, we refer to the Gibbs state for the particular case
of some specified realization of the disorder and define


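$q_J^{(1)} = \frac{1}{N}\sum_{i=1}^{N} \langle s_i \rangle^2 = \frac{1}{N}\sum_{i=1}^{N} \Big(\sum_a w^{[J]}_{[a]} \langle s_i \rangle_{[a]}\Big)\Big(\sum_b w^{[J]}_{[b]} \langle s_i \rangle_{[b]}\Big) = \sum_{a,b} w^{[J]}_{[a]}\, w^{[J]}_{[b]}\, q^{[J]}_{[ab]} = \int dq\, P^{[J]}(q)\, q$, (6-236a)

(check this out), where the large $N$ limit is implied. This, in other words,
is the first moment of $P^{[J]}(q)$. The higher moments are similarly defined:

$q_J^{(k)} = \frac{1}{N^k}\sum_{i_1=1}^{N} \cdots \sum_{i_k=1}^{N} \langle s_{i_1} \cdots s_{i_k} \rangle^2 = \int dq\, P^{[J]}(q)\, q^k$ ($k = 1, 2, \cdots$). (6-236b)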


Starting with these sample-dependent moments we now define the disorder-averaged
moments $q^{(k)}$ ($k = 1, 2, \cdots$) as

$q^{(k)} = \overline{q_J^{(k)}} = \int dq\, P(q)\, q^k$, (6-237)

where $P(q)$ stands for the disorder-averaged overlap distribution function.
It may be noted that in an expression such as (6-236b), one has
to consider arbitrarily large values of $N$, but the averaging over spins is
avoided if one averages over the disorder as

$q^{(k)} = \overline{\langle s_{i_1} s_{i_2} \cdots s_{i_k} \rangle^2}$ ($k = 1, 2, \cdots$), (6-238)

with $i_1, i_2, \cdots, i_k$ all different but otherwise arbitrary.


Alternative expressions for the moments $q^{(k)}$ are arrived at by invoking the
replica trick for the averaging over the disorder, which gives (refer to [37],
chapters 2, 3, [102], chapter 3)


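$q^{(k)} = \lim_{n\to 0} \frac{1}{n(n-1)} \sum_{\alpha \neq \beta} \big(q^{(\alpha\beta)}\big)^k$, (6-239a)

$q^{(\alpha\beta)} = \langle s^\alpha s^\beta \rangle$. (6-239b)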


In this expression, the average is with respect to the effective Hamiltonian
$H_{\rm eff}$ given by (6-192) (see also (6-202)). In the case of the phase space being
made of a single ergodic component (i.e., for $T > T_f$), where the replica
symmetric solution holds, one has $q^{(\alpha\beta)} = q$, the SK order parameter given
by (6-204b) (this is to be distinguished from the variable $q$ used above
as the argument of the overlap function $P(q)$; in the remainder of this
section, we use the symbol $q$ in this latter sense).


With the moments $q^{(k)}$ obtained as in (6-239a), the overlap function $P(q)$
works out to

$P(q) = \lim_{n\to 0} \frac{1}{n(n-1)} \sum_{\alpha \neq \beta} \delta\big(q^{(\alpha\beta)} - q\big)$, (6-240)

from which the $q^{(k)}$ can be obtained back by invoking (6-237). Equations
(6-235) and (6-240) are two complementary formulae for the overlap
function. The former is of a conceptual and abstract nature since
there is no way to identify and describe the ergodic components of the
phase space without subjecting the system to an external magnetic field
which, however, is to have a local variation exactly mimicking the equilibrium
state one is seeking to describe, there being a large number of such
states that one needs in order to work out the right hand side of (6-235).
Equation (6-240), on the other hand, leads to concrete determination of
the overlap distribution $P(q)$ and the moments $q^{(k)}$ on the basis of Parisi’s
ansatz relating to the breaking of the permutation symmetry among the
replicas.


Recall that, for a description involving $n$ replicas, where $n$ is a positive integer, $q^{\alpha\beta}$ represents an element of an $n \times n$ symmetric matrix of which all the diagonal elements are zero. In going over to the limit $n \to 0$, one has to continue in $n$ to all real positive values, which means that the 'zero-dimensional matrix' is to be defined with reference to a very large space involving $n \times n$ matrices of all positive orders (refer to [102]). In other words, the search for a stationary value of the free energy functional (6-199) is to be conducted in a space of large dimension, and the replica-symmetric ansatz $q^{\alpha\beta} = q$ (we revert here to the use of the symbol $q$ as the SK order parameter) is too restrictive in the case of broken ergodicity, as is made evident by the fact that the resulting replica-symmetric solution becomes unstable for $T < T_f$.


A more exhaustive search is made possible by the Parisi ansatz in which the permutation symmetry among the replicas is broken in stages in a hierarchical manner, where the elements $q^{\alpha\beta}$ and the resulting function $P(q)$ are worked out so as to satisfy a number of physical requirements (e.g., $P(q)$ has to be a positive function, the requirement of stability is to be met, and there has to be agreement with results obtained by numerical simulations).


At the end of the final stage of replica symmetry breaking, the $q^{\alpha\beta}$'s are represented by a function $q(x)$ that determines $f$, the latter being now a functional of $q(x)$, with $x$ representing the probability that the overlap (between two arbitrarily chosen equilibrium states) lies between 0 and $q$ (see below). Sums over replicas are replaced with integrals over $x$, with $x$ varying from 0 to 1. The equilibrium value of $f$ and the corresponding function $q(x)$ are determined by looking for the stationary solution for which the functional derivative of $f[q(x)]$ is zero. The equation determining the functional $f[q(x)]$ is of a complex form, and one arrives at a simpler form close to the transition temperature $T_f$, i.e., for small values of


$$\theta = \frac{T - T_f}{T_f}, \tag{6-241}$$


by ignoring a certain term of degree four in $q(x)$ (this is referred to as the 'Parisi approximation').


For $\theta > 0$, one has the paramagnetic phase of the spin glass with a constant function. The spin-glass transition occurs as $\theta$ crosses the value 0 from above, when the constant function $q\,(= q_{EA})$ gives rise to a non-trivial $q(x)$ which now determines the thermodynamic parameters of the spin glass including, in particular, the overlap function $P(q)$. In the present context, the scenario for sufficiently small values of $\theta\,(< 0)$ is outlined by approximating $f[q(x)]$ in a Landau expansion up to terms of degree four in $q(x)$ where, moreover, the Parisi approximation is made by discarding a fourth degree term that is supposed to be of no direct relevance. One assumes, moreover, that $J_0 = 0$, $h = 0$.


The relevant expression for the free energy functional turns out to be [10]


$$\beta f[q(x)] = \beta f_0 + \frac{1}{2} \int_0^1 dx \left[ |\theta|\, q^2(x) + \frac{1}{6} q^4(x) - \frac{1}{3} x\, q^3(x) - q(x) \int_0^x q^2(y)\, dy \right], \tag{6-242a}$$


where $f_0$ stands for the free energy at $q(x) = 0$. One needs the function $q(x)$ corresponding to the stationary value of $f$ for a small non-zero value of $\theta$. The condition for a stationary solution then works out to


$$2|\theta| - 2x\, q(x) - 2 \int_x^1 q(y)\, dy + 2 q^2(x) = 0, \tag{6-242b}$$


from which one obtains the equilibrium form of $q(x)$ shown in fig. 6-28(A).


One finds that, with $h = 0$, there exists a certain value ($x_{\max}$) of $x$ such that


$$h = 0: \quad \begin{aligned} q(x) &\approx \tfrac{1}{2} x && \text{for } 0 \le x \le x_{\max}, \\ q(x) &= q(x_{\max}) && \text{for } x_{\max} \le x \le 1. \end{aligned} \tag{6-243a}$$
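The piecewise form (6-243a) can be checked numerically against the stationarity condition (6-242b), read here as $2|\theta| - 2x\,q(x) - 2\int_x^1 q(y)\,dy + 2q^2(x) = 0$. The following Python sketch is illustrative only: the value $|\theta| = 0.05$ is arbitrary, the function names are hypothetical, and the plateau height is fixed by the assumption $|\theta| = q_{\max} - q_{\max}^2$, which is what the condition above gives at $x = x_{\max} = 2 q_{\max}$ for this truncated functional (the relation is not quoted in the text).

```python
import math


def q_max_from_theta(theta_abs):
    """Plateau height: smaller root of |theta| - q_max + q_max**2 = 0 (assumption)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * theta_abs))


def q_of_x(x, q_max):
    """Piecewise form (6-243a): q = x/2 up to x_max = 2*q_max, constant beyond."""
    return min(0.5 * x, q_max)


def integral_q(x, q_max, steps=20000):
    """Midpoint-rule estimate of the integral of q(y) from y = x to y = 1."""
    width = (1.0 - x) / steps
    return sum(q_of_x(x + (i + 0.5) * width, q_max) for i in range(steps)) * width


def residual(x, theta_abs):
    """Left-hand side of (6-242b) evaluated on the piecewise form."""
    q_max = q_max_from_theta(theta_abs)
    q = q_of_x(x, q_max)
    return 2 * theta_abs - 2 * x * q - 2 * integral_q(x, q_max) + 2 * q * q


theta_abs = 0.05                      # illustrative value only
x_max = 2 * q_max_from_theta(theta_abs)
for x in (0.0, 0.5 * x_max, x_max, 0.8):
    print(f"x = {x:.4f}  residual = {residual(x, theta_abs):+.2e}")
```

The residual should vanish to within the accuracy of the midpoint rule at each sampled $x$, both on the ramp and on the plateau.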


More generally, the function $q(x)$ for a small non-zero value of $h$ is of the form shown in fig. 6-28(B), where there appear two plateau regions depending on values $x_{\min}$, $x_{\max}$ of the variable $x$ such that


$$h \neq 0: \quad \begin{aligned} q(x) &= q(x_{\min}) && \text{for } 0 \le x \le x_{\min}, \\ q(x) &\approx \tfrac{1}{2} x && \text{for } x_{\min} \le x \le x_{\max}, \\ q(x) &= q(x_{\max}) && \text{for } x_{\max} \le x \le 1, \end{aligned} \tag{6-243b}$$


and where $x_{\min}$, which determines the lower of the two plateaus, is given by


3 2 
5tmin = 40) © 7 (s5)*. (6-243c) 


Figure 6-28: Depicting the variation of overlap $q$ against probability $x$ for two arbitrarily chosen equilibrium states to have the overlap between 0 and $q$ (schematic); (A) for $h = 0$, and for $T < T_f$, there occurs a plateau $q = q_{\max}$ for $x > x_{\max}$, the latter corresponding to the case when two states are identical; thus $q_{\max}$ stands for the self-overlap of an equilibrium state; (B) for a non-zero value of $h$, there appears a second plateau for $0 < x < x_{\min}$, where $x_{\min}$ stands for the probability of the overlap between two arbitrarily chosen states to lie between 0 and $q_{\min}$, the minimum possible value of the overlap for the given $h$ and $T$; in either case the inverse function $x(q)$ determines the overlap function $P(q)$ as in (6-244b).


The inverse function of $q(x)$ is $x(q)$, defined as $q(x(q)) = q$, and is related to the overlap function $P(q)$ as


$$x(q) = \int_0^q P(q')\, dq', \tag{6-244a}$$


which means that, knowing the function $q(x)$ and inverting it, the overlap function is obtained as


$$P(q) = \frac{d}{dq} x(q). \tag{6-244b}$$


In other words, as mentioned above, $x(q)$ represents the integrated probability that the overlap between two arbitrarily chosen states lies between 0 and $q$. Thus, $x_{\max}$ is the probability that the overlap lies between 0 and the maximum possible value $q_{\max} = q(x_{\max})$ (in which case the two states are the same and $q$ measures the self-overlap), while $x_{\min}$ stands for the probability that two pure states have an overlap between 0 and the minimum possible value $q_{\min} = q(x_{\min})$ (recall that $x_{\min}$ depends on the field parameter $h$, and is zero when $h = 0$; $x_{\max}$ carries a much weaker field dependence).


Considering the general case of a non-zero field parameter $h$, one observes from the discontinuity in the derivative of $q(x)$ at $x = x_{\min}$ and $x = x_{\max}$ that $P(q)$ involves two delta functions at $q = q_{\min} = q(x_{\min})$ and $q = q_{\max} = q(x_{\max})$ respectively, as shown in fig. 6-29, the two delta functions being interpolated with a continuous variation in $q$. One of the two delta functions (the one at $q_{\min}$) disappears for $h = 0$, when $q_{\min} = 0$, while at a critical field strength $h_c$ the two delta functions merge into one ($x_{\min}$ and $x_{\max}$ coincide), when one has $q_{\min} = q_{\max} = q_{EA}$, the Edwards-Anderson order parameter. At any given temperature $T < T_f$, $h_c$ is nothing but the field strength for the corresponding point on the AT line (refer to fig. 6-24), and is given by (refer to (6-215))


=), (6-245) 


Thus, for $T \to T_f$, $h_c \to 0$ and $q_{EA} \to 0$, and one recovers the trivial replica-symmetric solution $q = 0$.
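The relation between $q(x)$ and $P(q)$ can be made concrete with a short Python sketch. It assumes the idealized piecewise form of (6-243b), with $q = x/2$ taken as exact on the ramp, so that $P(q) = dx/dq$ consists of a delta function of weight $x_{\min}$ at $q_{\min}$, a constant density 2 between $q_{\min}$ and $q_{\max}$, and a delta function of weight $1 - x_{\max}$ at $q_{\max}$. The numerical values of $x_{\min}$, $x_{\max}$ and all function names are hypothetical, chosen only for illustration. The sketch checks that $P(q)$ is normalized and that its moments coincide with $\int_0^1 q(x)^k\, dx$.

```python
def overlap_distribution(x_min, x_max):
    """Describe P(q) = dx/dq for the piecewise q(x) of (6-243b).

    Returns the delta-function peaks as (location, weight) pairs, the
    constant density on the ramp q = x/2, and the ramp end points.
    """
    q_min, q_max = 0.5 * x_min, 0.5 * x_max
    peaks = [(q_min, x_min), (q_max, 1.0 - x_max)]
    density = 2.0  # dx/dq where q = x/2
    return peaks, density, (q_min, q_max)


def moment_from_P(k, x_min, x_max):
    """k-th moment of P(q): smooth part integrated exactly, plus the two peaks."""
    peaks, density, (q_lo, q_hi) = overlap_distribution(x_min, x_max)
    smooth = density * (q_hi ** (k + 1) - q_lo ** (k + 1)) / (k + 1)
    return smooth + sum(weight * q ** k for q, weight in peaks)


def moment_from_qx(k, x_min, x_max, steps=100000):
    """Midpoint-rule estimate of the integral of q(x)**k over 0 <= x <= 1."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += (0.5 * min(max(x, x_min), x_max)) ** k
    return total * width


x_min, x_max = 0.1, 0.4  # illustrative values only
print("normalization:", moment_from_P(0, x_min, x_max))
for k in (1, 2, 3):
    print(k, moment_from_P(k, x_min, x_max), moment_from_qx(k, x_min, x_max))
```

Setting `x_min = 0` removes the lower delta function, which corresponds to the case $h = 0$.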


Figure 6-29: Depicting schematically the overlap function $P(q)$ for the general case $0 < h < h_c$ (see text), in which case the graph of $P(q)$ includes two delta functions with a continuously interpolating part between the two; the delta function at $q = q_{\min}$ disappears as $h \to 0$, while the two delta functions merge into one single delta function as $h$ goes to $h_c$ corresponding to the given temperature $T\,(< T_f)$.


In figures 6-28, 6-29, only positive values of $q$ are shown. In reality, pure phases occur in spin-reversed pairs, and the overlaps $q$ may be positive or negative when there is no site-dependent weak local field to make one set of pure phases preferred over the spin-reversed phases, and when the occurrence of any particular pure phase signifies a spontaneously broken spin-flip symmetry. A vanishingly weak local field with some appropriate site-to-site variation, on the other hand, forces the system to choose between spin-flipped phases. Assuming that all the pure phases are chosen with such appropriate local fields one may, in principle, arrange matters such that only positive values of $q$ are relevant. In contrast, both positive and negative values of $q$ would be equally likely in the case of spontaneously broken spin-flip symmetry, when $P(q)$ would be a symmetric function of $q$.


In the spin-glass phase, the minimum possible overlap $q_{\min} = q(0)$ depends only on $h$ as in (6-243c), while the self-overlap $q_{\max} = q_{EA}$ has a much weaker dependence on $h$, and depends on $T = T_f(1 + \theta)$ as [10]


$$q_{\max} \approx |\theta| + |\theta|^2 - |\theta|^3 + O(|\theta|^4). \tag{6-246}$$


Knowing $q(x)$ (or, equivalently, the overlap function $P(q)$) and the mean magnetization $m$, one can work out the thermodynamic parameters of the SK spin glass. Results are to be found in [10], [37], and [102]. While Parisi's replica symmetry breaking ansatz, explained in detail in [102], appears to lack a solid mathematical justification, it has been amply validated by numerical computations covering a wide area. There has also been progress in clarifying the mathematical basis of Parisi's theory. In particular, the discovery of ultrametricity (see [102]) in the space of all possible overlaps is indicative of a deep mathematical structure, having physical relevance.


The acid test of the replica symmetry breaking ansatz lies in the requirement of non-negativity of the eigenvalues of the Hessian matrix obtained from the free energy by working out the second order functional derivative with respect to the function $q(x)$, evaluated around the stationary point given by (6-242b). In this context, recall that one of the eigenvalues of the Hessian matrix evaluated at the replica-symmetric solution of sec. 6.5.7.1 turns out to be negative for $T < T_f$, indicating that the replica-symmetric solution is not valid below the transition temperature. In the case of the replica symmetry breaking solution, one obtains two groups of continuously distributed eigenvalues, all non-negative. All the eigenvalues of one of these groups are positive, while those of the other group extend down to zero. In addition, there arise some isolated zero eigenvalues. In any case, the absence of negative eigenvalues indicates that the instability of the replica-symmetric SK solution below the freezing transition is cured in the solution arrived at by means of the Parisi ansatz. The lingering marginal stability associated with the zero eigenvalues is indicative of the complexity of the spin-glass phase that remains to be explored.


Chapter 7


Interacting systems III: bosons and fermions


7.1 Interacting bosons and fermions: introduction


Non-interacting systems made up of large numbers of identical bosons 
or identical fermions referred to, respectively, as the ideal Bose gas and 
the ideal Fermi gas were considered in section 3.3. It was observed that, 
at sufficiently high temperatures, the specifically quantum mechanical 
features of an assembly of identical non-interacting particles can be ig- 
nored to a good degree of approximation, and the macroscopic proper- 
ties of the system can be accounted for satisfactorily on the basis of the 
Maxwell-Boltzmann distribution. At relatively low temperatures, on the 
other hand, when the thermal de Broglie wavelength of the particles of the 


gas becomes comparable with their mean separation, quantum effects assume
overwhelming relevance, since the quantum mechanical symmetry
requirements on the possible states of the system of particles turn out to 


be of special significance. 


However, non-interacting systems are idealized ones and are relevant only as points of departure, with reference to which one can analyze and explain the behavior of interacting systems. The behavior of an interacting system at high temperatures ($T \gg T_p$, see formulae (3-60a) - (3-61b)) can be described in terms of small corrections over formulae obtained from classical statistical mechanics. Such corrections were considered in the context of the virial expansion in section 4.2, where we saw that the deviation of the quantum mechanical partition function of a gas from the classical one can be expressed as a sum of small correction terms in increasing powers of the Planck constant $h$. Referring to formula (4-56), one observes that the leading correction, which is of the order of $h^2$, depends on the strength and range of the interaction between the constituent particles, and does not depend on the symmetry requirement that distinguishes between bosons and fermions. The next correction, on the other hand, which is of the order of $h^3$, does make such a distinction and arises even for an ideal gas.


In the present chapter, we focus on the low temperature regime where 
one can no longer describe quantum effects as small corrections. The 
point of departure is now the degenerate ideal gas made up of bosons or 
fermions (refer to sections 3.3.7.2 and 3.3.6.2) where one finds that the 
behavior of the quantum mechanical ideal gas deviates strongly from that 


of the classical ideal gas. In the case of a Bose gas, this is revealed by the
phenomenon of Bose condensation, while in the case of the Fermi gas one
observes a substantial departure from classical behavior as revealed by 
the temperature dependence of the internal energy and the specific heat 


at constant volume. 


In order to illustrate how quantum principles are invoked to explain the macroscopic behavior of interacting bosonic and fermionic systems in the low temperature regime ($T \lesssim T_p$) and to reveal novel phenomena as compared to the behavior of corresponding non-interacting systems, I present below, in bare outline, a number of basic concepts in the theories of weakly interacting dilute Bose gases, superfluidity, Fermi liquids, and superconductivity, all four of great relevance in the physics of systems at low temperatures.


We will, however, confine ourselves to a consideration relating to static 
or equilibrium configurations alone, and will not look at dynamic (i.e., 
non-equilibrium) properties which, however, are of major relevance. The 
basic principles of non-equilibrium evolution of systems will be taken up 
in chapters 8, 9, and 10. In particular, chapter 8 includes the principles 
underlying the theory of linear response, where relevant non-equilibrium 
properties (viz., various transport coefficients) are determined in terms 
of equilibrium averages of products of observables evaluated at unequal 
times. These principles are, however, of a general nature, and transport 
properties of specific systems, such as the ones considered in the present 
chapter, will not be considered (refer to [113] for a number of results in 


this regard).


We begin by briefly sketching, in sec. 7.2 below, the many-body formalism
for the description of systems of identical particles in quantum theory, 


commonly referred to as ‘second quantization’. 


7.2 Many-body formalism for identical particles: a brief outline


A large part of the explanation of the phenomenon of superfluidity is 
based on the quantum theory of systems of identical bosons. In the expla- 
nation of superconductivity, on the other hand, one needs the quantum 


theory of systems of identical fermions. 


In describing a system of identical bosons or fermions one needs to start from a complete orthonormal set of state vectors for a single particle considered all by itself, spanning a vector space $\mathcal{H}$ and then, for a system of $N$ particles (with or without interaction among themselves), to consider the direct product $\mathcal{H}^{[N]} = \mathcal{H}^{(1)} \otimes \cdots \otimes \mathcal{H}^{(N)}$ of $N$ copies of $\mathcal{H}$, with reference to which the possible states of the assembly of $N$ identical particles are identified as those belonging to a certain subspace. The latter is specified in accordance with the symmetry requirement to be satisfied by the $N$-particle states.


Let $S$ be an orthonormal basis of $\mathcal{H}$, made up of vectors $|u_1\rangle, |u_2\rangle, \cdots$ (we assume that $S$ is denumerable), in terms of which any single-particle state can be expressed as a linear superposition. This corresponds to a basis $S^{(a)}$ in $\mathcal{H}^{(a)}$ ($a = 1, 2, \cdots$), where the super-index $a$ distinguishes between the various particles of the system, the distinction being a notional one since the admissible states of the system make no distinction among the particles (thus each $S^{(a)}$ is a copy of $S$, as each $\mathcal{H}^{(a)}$ is a copy of $\mathcal{H}$). A typical basis state belonging to $S^{(a)}$ will be denoted by $|u_j^{(a)}\rangle$ ($j = 1, 2, \cdots$), where the upper and lower indices ($a$ and $j$) will be referred to as the (notional) particle index and the single-particle state index respectively.


1. The single-particle basis states $|u_1\rangle, |u_2\rangle, \cdots$ in $\mathcal{H}$ (and the corresponding states $|u_1^{(a)}\rangle, |u_2^{(a)}\rangle, \cdots$ in $\mathcal{H}^{(a)}$ ($a = 1, 2, \cdots, N$)) are assumed to be ordered in some specified manner. Commonly, these are taken to be the eigenstates of some relevant single-particle Hamiltonian, in increasing order of energy.


2. For instance, in the case of a particle confined within a cubical box of side $L$, one can consider plane wave states satisfying periodic boundary conditions, in which case the symbol $u$ (in $u_j$ ($j = 1, 2, \cdots$)) stands for a 3-vector $\mathbf{k}$ (the wave vector) that can take up values from an infinite set of distinct 3D vectors, such that each of the Cartesian components of any one of these vectors (with Cartesian axes parallel to three orthogonal sides of the cube) is of the form $\frac{2\pi}{L} n$ ($n = 0, \pm 1, \pm 2, \cdots$). All such 3D vectors are necessary to exhaustively enumerate a basis for the possible single-particle states and constitute the set $S$ in this instance. Recall that each of the sets $S^{(a)}$ is a copy of $S$.
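The ordered set of plane-wave labels described in the two remarks above can be generated explicitly. The following Python sketch assumes the standard periodic-boundary form, each Cartesian component being $2\pi n / L$ with $n$ any integer, and orders the vectors by $|\mathbf{k}|^2$ (i.e., by free-particle energy). The cutoff `n_max` and the function name are hypothetical, introduced only to keep the list finite.

```python
import itertools
import math


def allowed_wave_vectors(box_side, n_max):
    """Wave vectors (2*pi/L)*(n_x, n_y, n_z) with |n_i| <= n_max, sorted by |k|**2."""
    unit = 2.0 * math.pi / box_side
    triples = itertools.product(range(-n_max, n_max + 1), repeat=3)
    vectors = [tuple(unit * n for n in triple) for triple in triples]
    return sorted(vectors, key=lambda k: sum(c * c for c in k))


k_list = allowed_wave_vectors(box_side=1.0, n_max=1)
print(len(k_list))   # 27 vectors for n_max = 1
print(k_list[0])     # the k = 0 state comes first
```

States of equal $|\mathbf{k}|^2$ are degenerate, so their relative order within the list is a matter of convention.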
A basis for $\mathcal{H}^{[N]}$ is then made up of direct products of the form


$$\prod_{a=1}^{N} |u_{j_a}^{(a)}\rangle = |u_{j_1}^{(1)}\rangle \otimes \cdots \otimes |u_{j_a}^{(a)}\rangle \otimes \cdots \otimes |u_{j_N}^{(N)}\rangle = |u_{j_1}, \cdots, u_{j_a}, \cdots, u_{j_N}\rangle, \tag{7-1}$$


where the last expression is introduced by way of a convenient notation and where each of the sub-indices $j_a$ ($a = 1, 2, \cdots, N$) can take up a positive integer value, labeling some single-particle state $|u_{j_a}\rangle$.


However, a vector of the form (7-1) is not, in general, an admissible state for the assembly of identical particles under consideration, because an admissible state has to satisfy a definite symmetry requirement under the operation of permutation of the sub-indices $j_1, j_2, \cdots, j_a, \cdots, j_N$ labeling the single-particle states in $\prod_{a=1}^{N} |u_{j_a}^{(a)}\rangle$ (one can equally well refer to a permutation of the super-indices labeling the various factor spaces) and, accordingly, has to be in the form of an appropriate superposition of direct product vectors.


In the case of bosons, an admissible state has to be completely symmetric under permutations of the sub-indices $j_1, j_2, \cdots, j_a, \cdots, j_N$, and can be expressed as a symmetrized superposition of product vectors of the form


$$[\text{bosons:}] \quad |\{u_{j_1}, \cdots, u_{j_a}, \cdots, u_{j_N}\}\rangle = \mathcal{N}_B \sum_P |u_{Pj_1}, \cdots, u_{Pj_a}, \cdots, u_{Pj_N}\rangle, \tag{7-2}$$


where $\mathcal{N}_B$ stands for a normalization constant, and the summation on the right hand side is over all possible permutations of the state indices, a typical permutation being denoted as $P$, under which the set $\{j_1, j_2, \cdots, j_a, \cdots, j_N\}$ gets transformed to $\{Pj_1, Pj_2, \cdots, Pj_a, \cdots, Pj_N\}$. For instance, for a system of 3 particles and three distinct single-particle states $|u_1\rangle, |u_2\rangle, |u_3\rangle$, the symmetrized bosonic state is


$$|\{u_1, u_2, u_3\}\rangle = \frac{1}{\sqrt{6}} \Big( |u_1, u_2, u_3\rangle + |u_2, u_3, u_1\rangle + |u_3, u_1, u_2\rangle + |u_2, u_1, u_3\rangle + |u_1, u_3, u_2\rangle + |u_3, u_2, u_1\rangle \Big). \tag{7-3}$$


In equations (7-2), (7-3), the particle index $a$ ($= 1, 2, \cdots, N$) is not explicitly displayed since, in a product vector, the position of a factor in the product is indicative of the factor space $\mathcal{H}^{(a)}$ it belongs to. In the following, we will continue to keep the particle index implied while referring to product vectors making up the symmetrized or antisymmetrized states till we come to many-particle operators (we will, in particular, consider those that can be expressed as sums of single-particle or two-particle ones).


In contrast to the case of bosons, an admissible basis state for a system of $N$ identical fermions is required to be a completely antisymmetric superposition of direct product states. Thus, starting from the unsymmetrized product vector $|u_{j_1}, \cdots, u_{j_a}, \cdots, u_{j_N}\rangle$ (in order of increasing values of the particle index $a$; the sub-indices $j_1, \cdots, j_N$ take up positive integral values subject to restrictions imposed by the antisymmetry requirement, see below), one generates the completely antisymmetric basis state


$$[\text{fermions:}] \quad |\{u_{j_1}, \cdots, u_{j_a}, \cdots, u_{j_N}\}\rangle = \mathcal{N}_F \sum_P \epsilon_P\, |u_{Pj_1}, \cdots, u_{Pj_a}, \cdots, u_{Pj_N}\rangle, \tag{7-4}$$


where $\mathcal{N}_F$ is again a normalization factor, and $\epsilon_P$ stands for the parity of the permutation $P$ ($+1$ for an even permutation and $-1$ for an odd permutation). For instance, with $N = 2$, the product vector $|u_1 u_2\rangle$ (the commas within expressions for product vectors are often done away with for the sake of brevity) is antisymmetrized to


$$|\{u_1 u_2\}\rangle = \frac{1}{\sqrt{2}} \big( |u_1 u_2\rangle - |u_2 u_1\rangle \big), \tag{7-5a}$$


while, with $N = 3$, the product vector $|u_1 u_2 u_3\rangle$ gets antisymmetrized to


$$|\{u_1, u_2, u_3\}\rangle = \frac{1}{\sqrt{6}} \Big( |u_1 u_2 u_3\rangle + |u_2 u_3 u_1\rangle + |u_3 u_1 u_2\rangle - |u_2 u_1 u_3\rangle - |u_1 u_3 u_2\rangle - |u_3 u_2 u_1\rangle \Big). \tag{7-5b}$$


In (7-2) and (7-4) the completely symmetrized or completely antisymmetrized state (for bosons and fermions respectively) generated from the product vector $|u_{j_1} \cdots u_{j_a} \cdots u_{j_N}\rangle$ has been denoted by $|\{u_{j_1}, \cdots, u_{j_a}, \cdots, u_{j_N}\}\rangle$. With reference to such a state for an assembly of fermions (eq. (7-4)), it is necessary that the single-particle labels $u_{j_1}, \cdots, u_{j_N}$ be all different since, otherwise, the prescription for antisymmetrization results in a null vector (reason this out). In other words, no single-particle state can be occupied by more than one fermion in an assembly (Pauli's exclusion principle), this being the restriction alluded to above. In the case of bosons, on the other hand, the symmetrization prescription imposes no such restriction, and a single-particle state can accommodate an arbitrary number of bosons (recall that this is what happens in BE condensation).
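The prescriptions (7-2) and (7-4) can be tried out on small examples. The following Python sketch represents a superposition of product vectors as a dictionary mapping tuples of single-particle labels to amplitudes; the function names and the string labels are hypothetical, and the normalization is computed numerically rather than taken from a closed formula. It also shows that antisymmetrizing a product vector with a repeated label yields the null vector.

```python
from itertools import permutations
from math import sqrt


def permutation_parity(perm):
    """+1 for an even permutation of range(len(perm)), -1 for an odd one."""
    sign, seen = 1, [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:          # walk one cycle of the permutation
            seen[j] = True
            j = perm[j]
            length += 1
        if length % 2 == 0:         # a cycle of even length is an odd permutation
            sign = -sign
    return sign


def project(labels, fermions):
    """Sum over permuted product vectors, with parity signs for fermions.

    Returns a normalized {label tuple: amplitude} dict, or {} for the null vector.
    """
    state = {}
    for perm in permutations(range(len(labels))):
        key = tuple(labels[p] for p in perm)
        weight = permutation_parity(perm) if fermions else 1
        state[key] = state.get(key, 0) + weight
    state = {k: v for k, v in state.items() if v != 0}
    norm = sqrt(sum(v * v for v in state.values()))
    return {k: v / norm for k, v in state.items()} if norm else {}


print(project(("u1", "u2"), fermions=True))        # two terms of opposite sign
print(project(("u1", "u1", "u2"), fermions=False)) # boson state with a doubly occupied label
print(project(("u1", "u1"), fermions=True))        # {} : the null vector
```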


Incidentally, in a representation of the form (7-2), (7-4), the ordering of the state labels $u_{j_1}, \cdots, u_{j_N}$ is immaterial since a rearrangement of the labels results at most in a change of sign. The super-index $a$ is now immaterial in virtue of the complete symmetry or complete antisymmetry, as the case may be.


The number of particles in a specified single-particle state in an $N$-particle bosonic or fermionic state is referred to as the occupation number of that single-particle state. Evidently, the occupation number of any single-particle state can be either 0 or 1 in the case of fermions, while it can have any non-negative integral value (ranging from 0 to $N$) in the case of bosons (the label for an unoccupied single-particle state does not, however, occur in the representation (7-2) or (7-4)). The occupation number representation of a bosonic or fermionic state involves the specification of the labels for the occupied single-particle states as also the corresponding occupation numbers. For instance, the notation $|\{u_1[2], u_2\}\rangle$ can be used to denote the 3-particle bosonic state $|\{u_1 u_1 u_2\}\rangle$. In the case of fermions, the occupation number of all occupied single-particle states is necessarily unity, and need not be mentioned explicitly. In an alternative (and more commonly employed) form of the occupation number representation, one specifies an ordering of all the single-particle states (e.g., eigenstates of the single-particle Hamiltonian $H$ in increasing order of energy, as mentioned earlier) and then lists the occupation numbers of all these single-particle states, arranged in the specified order, where single-particle states with occupation number zero are also listed. Thus, with a specified ordering of the single-particle states in a basis $S$ ($|u_1\rangle, |u_2\rangle, \cdots, |u_j\rangle, \cdots$; one usually has an infinity of these basis states), an $N$-particle state in the occupation number representation is of the form


$$|\Psi^{(N)}\rangle = |\{n_1, n_2, \cdots, n_j, \cdots\}\rangle, \tag{7-6a}$$


where each of the numbers $n_j$ ($j = 1, 2, \cdots$) can have a non-negative integer value (the occupation number of the state $|u_j\rangle$) such that


$$\sum_j n_j = N. \tag{7-6b}$$


The occupation number representation was invoked in section 3.3.1 to work out the statistical mechanics of ideal Bose and Fermi gases, making use of the grand canonical ensemble.


Recall that in the occupation number representation, the particle index a 


has no relevance. 


States of the form (7-6a) are all assumed to be normalized. In this notation, the bosonic state represented above as $|\{u_1[2], u_2\}\rangle$ appears as $|\{2, 1, 0, 0, 0, \cdots\}\rangle$. Similarly, a fermionic state with one particle each in the single-particle states $|u_1\rangle$, $|u_3\rangle$ and with all the other single-particle states unoccupied, appears as $|\{1, 0, 1, 0, 0, \cdots\}\rangle$.
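The passage from a list of occupied single-particle labels to the occupation numbers of (7-6a) amounts to a simple count. A minimal Python sketch, with a hypothetical function name and with the infinite list of occupation numbers truncated after `n_listed` entries, reproduces the two examples just mentioned.

```python
from collections import Counter


def occupation_numbers(state_indices, n_listed):
    """Occupation numbers (n_1, ..., n_{n_listed}) for a list of occupied-state indices.

    state_indices holds one 1-based single-particle index j per particle,
    e.g. [1, 1, 2] for the bosonic state |{u_1[2], u_2}>.
    """
    counts = Counter(state_indices)
    return tuple(counts.get(j, 0) for j in range(1, n_listed + 1))


print(occupation_numbers([1, 1, 2], 5))  # (2, 1, 0, 0, 0)
print(occupation_numbers([1, 3], 5))     # (1, 0, 1, 0, 0)
```

The sum of the entries returns the particle number $N$, in accordance with (7-6b).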


In many-body quantum theory, one often works with states for which the number of particles is not specified. In other words, one has to consider superpositions of states with various different numbers of particles. Thus, if $|\Psi^{(N)}\rangle$ denotes a (symmetrized or antisymmetrized) state with $N$ particles, as in (7-6a), then one needs to refer to arbitrary superpositions (with appropriate normalization) of these with various different values of $N$ and with various possible distributions of the occupation numbers.


In working with such many-body states, the formalism of second quantization, based on creation and annihilation operators, is of great advantage. We start from a normalized state $|0\rangle$, termed the vacuum state, which is assumed to be uniquely defined and represents the 'zero-particle state', from which single-particle states ($|u_1\rangle, |u_2\rangle, \cdots, |u_j\rangle, \cdots$) are generated with the help of the creation operators as mentioned below. Making use of the algebra of the creation and annihilation operators, more general states of the form (7-6a) can be related to one another with little difficulty.


Though $|u_j\rangle$ ($j = 1, 2, \cdots$) denotes a single-particle state, it is, at times, taken to stand for the many-body state $|\{0, 0, \cdots, 0, 0, 1, 0, 0, \cdots\}\rangle$ in the occupation number representation, where 1 occurs in the $j$th position; according to the notation explained above, this stands for a state in which the occupation number of the $j$th single-particle state is unity while all other occupation numbers are zero. One can discern from the context whether one is talking of a single-particle state or a many-body state.


We denote the annihilation operators by $\hat{a}_i$ ($i = 1, 2, \cdots$), which satisfy


$$\hat{a}_i |u_i\rangle = |0\rangle, \tag{7-7a}$$


or, more generally, in the occupation number representation,


$$\hat{a}_i |\{n_1, n_2, \cdots, n_i = n, n_{i+1}, \cdots\}\rangle = \sqrt{n}\, |\{n_1, n_2, \cdots, n_i = n - 1, n_{i+1}, \cdots\}\rangle. \tag{7-7b}$$


In other words, $\hat{a}_i$ (at times referred to as a lowering operator) lowers the number of particles in the single-particle state $|u_i\rangle$ by one, while leaving unchanged the numbers in the other single-particle states. The factor $\sqrt{n}$ in eq. (7-7b) ensures consistency with the commutation relations (or anti-commutation relations, as the case may be; see below) between the annihilation and creation operators, assuming that all states of the form (7-6a) are normalized.


Once again, it is to be read from the context whether an operator such as $\hat{a}_i$ or $\hat{a}_i^{\dagger}$ is meant as one acting in a single-particle or in a many-body state space.


The creation operators ($\hat{a}_i^{\dagger}$ ($i = 1, 2, \cdots$)) are the Hermitian conjugates of the corresponding annihilation operators and satisfy


$$\hat{a}_i^{\dagger} |\{n_1, n_2, \cdots, n_i = n, n_{i+1}, \cdots\}\rangle = \sqrt{n + 1}\, |\{n_1, n_2, \cdots, n_i = n + 1, n_{i+1}, \cdots\}\rangle, \tag{7-8}$$


i.e., $\hat{a}_i^{\dagger}$ (at times referred to as a raising operator) raises the number of particles in the single-particle state $|u_i\rangle$ by one, while leaving unchanged the numbers in the other single-particle states.
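The action of the lowering and raising operators in (7-7b) and (7-8) can be mimicked on occupation-number tuples. The following Python sketch is meant for the bosonic case only; the function names are hypothetical, the mode index is 0-based, and each call returns the numerical factor together with the new list of occupation numbers.

```python
from math import sqrt


def annihilate(i, state):
    """Lowering operator for mode i (0-based) acting on an occupation tuple.

    Returns (factor, new_state) as in (7-7b), or None for the null vector.
    """
    n = state[i]
    if n == 0:
        return None
    return sqrt(n), state[:i] + (n - 1,) + state[i + 1:]


def create(i, state):
    """Raising operator for mode i (0-based), following (7-8)."""
    n = state[i]
    return sqrt(n + 1), state[:i] + (n + 1,) + state[i + 1:]


state = (2, 1, 0)                       # n_1 = 2, n_2 = 1, n_3 = 0
f_down, lowered = annihilate(0, state)  # sqrt(2) times (1, 1, 0)
f_up, restored = create(0, lowered)     # sqrt(2) times (2, 1, 0)
print(restored, f_down * f_up)          # raising after lowering returns the state times n_1
```

Raising first and lowering afterwards gives the factor $n + 1$ instead of $n$; the difference of unity is the content of the bosonic commutation relation between a lowering operator and its conjugate.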


The raising and lowering operators $\hat{a}_i$, $\hat{a}_i^{\dagger}$ ($i = 1, 2, \cdots, j, \cdots$) introduced above are specific to the set of single-particle states $|u_1\rangle, |u_2\rangle, \cdots, |u_j\rangle, \cdots$. For a basis $T$ different from $S$ (in $\mathcal{H}$), one would have a different set of raising and lowering operators. In the following, we will refer to raising and lowering operators in the co-ordinate and momentum representations.




The raising and lowering operators satisfy between themselves a set of commutation relations (for bosons) or anti-commutation relations (for fermions) which we now state (relations other than the ones listed can be obtained as consequences of the latter):


$$[\text{bosons:}] \quad [\hat{a}_i, \hat{a}_j] = 0, \quad [\hat{a}_i, \hat{a}_j^{\dagger}] = I \delta_{ij} \quad (i, j = 1, 2, \cdots), \tag{7-9a}$$


$$[\text{fermions:}] \quad \{\hat{a}_i, \hat{a}_j\} = 0, \quad \{\hat{a}_i, \hat{a}_j^{\dagger}\} = I \delta_{ij} \quad (i, j = 1, 2, \cdots). \tag{7-9b}$$


[notation: for any two operators $A$, $B$, $[A, B] = AB - BA$ (commutator), and $\{A, B\} = AB + BA$ (anti-commutator); $I$ stands for the identity operator in the relevant state space].
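The fermionic relations (7-9b) can be verified by brute force on a small number of single-particle states. The Python sketch below makes one assumption that goes beyond the formulae quoted above: for fermions, the action of a raising or lowering operator on an occupation-number state is taken to carry the sign $(-1)^{\sum_{k<i} n_k}$, which is the standard convention fixing the relative phases of the basis states. All function names are hypothetical, and vectors are stored as dictionaries from occupation tuples to amplitudes.

```python
from itertools import product


def jw_sign(i, state):
    """Sign (-1)**(number of occupied modes before mode i); an assumed convention."""
    return -1 if sum(state[:i]) % 2 else 1


def apply_op(i, vec, creating):
    """Apply the raising (creating=True) or lowering operator of mode i to a vector."""
    before, after = (0, 1) if creating else (1, 0)
    out = {}
    for state, amp in vec.items():
        if state[i] == before:      # otherwise this term is the null vector
            new_state = state[:i] + (after,) + state[i + 1:]
            out[new_state] = out.get(new_state, 0) + jw_sign(i, state) * amp
    return out


def add(u, v):
    """Sum of two vectors, dropping terms whose amplitudes cancel."""
    out = dict(u)
    for state, amp in v.items():
        out[state] = out.get(state, 0) + amp
    return {s: a for s, a in out.items() if a != 0}


def relations_hold(n_modes):
    """Check {a_i, a_j} = 0 and {a_i, a_j^dagger} = delta_ij on every basis state."""
    for state in product((0, 1), repeat=n_modes):
        vec = {state: 1}
        for i in range(n_modes):
            for j in range(n_modes):
                a_adag = apply_op(i, apply_op(j, vec, True), False)
                adag_a = apply_op(j, apply_op(i, vec, False), True)
                if add(a_adag, adag_a) != (vec if i == j else {}):
                    return False
                a_i_a_j = apply_op(i, apply_op(j, vec, False), False)
                a_j_a_i = apply_op(j, apply_op(i, vec, False), False)
                if add(a_i_a_j, a_j_a_i) != {}:
                    return False
    return True


print(relations_hold(3))  # True
```

Without the sign factor, operators belonging to different single-particle states would commute rather than anti-commute.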


The commutation or anti-commutation relations imply that the operator


$$\hat{N}_i = \hat{a}_i^{\dagger} \hat{a}_i \tag{7-10a}$$


is the number operator for the single-particle state $|u_i\rangle$, i.e., in the occupation number representation


$$\hat{N}_i |\{n_1, n_2, \cdots, n_i = n, n_{i+1}, \cdots\}\rangle = n\, |\{n_1, n_2, \cdots, n_i = n, n_{i+1}, \cdots\}\rangle, \tag{7-10b}$$


while


$$\hat{N} = \sum_i \hat{N}_i \tag{7-10c}$$


is the total number operator:


$$\hat{N} |\{n_1, n_2, \cdots, n_i, n_{i+1}, \cdots\}\rangle = \Big( \sum_i n_i \Big) |\{n_1, n_2, \cdots, n_i, n_{i+1}, \cdots\}\rangle. \tag{7-10d}$$


As a consequence of the commutation and anti-commutation relations (for bosons and fermions respectively), the state resulting from the application of a product of the creation and annihilation operators, each raised to an integral power (for example, a product of the form $(\hat{a}_i^{\dagger})^{p} (\hat{a}_j^{\dagger})^{q}\, \hat{a}_k \hat{a}_l$), on any fully symmetrized or fully antisymmetrized state (for bosons and fermions respectively) continues to retain the symmetry or the antisymmetry property (as the case may be; in the case of fermions, the application of an annihilation operator raised to a power $\geq 2$ results in the null vector; the latter is to be distinguished from the vacuum state $|0\rangle$). This provides us with a means of representing any completely symmetric or completely antisymmetric state with some specified particle number $N$ as one resulting from the application of a product of a number of creation operators, each raised to some integral power ($\geq 1$; the sum of the powers has to be $N$) on the vacuum state, a representation far more convenient than the occupation number representation of the form (7-6a). The normalization of a state generated in this manner is to be fixed in accordance with (7-7b), (7-8). A fully symmetric or fully antisymmetric state with an unspecified number of particles is then obtained as a superposition of states with various different particle numbers, again with appropriate normalization.
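As an illustration of how the normalization is fixed by (7-8), consider the case of bosons. Repeated application of a raising operator to the vacuum state gives the following (a standard result, written out here as a worked example and not quoted from the text):

```latex
(\hat{a}_j^{\dagger})^{n}\, |0\rangle
  = \sqrt{1}\,\sqrt{2} \cdots \sqrt{n}\;\, |\{0, \cdots, 0, n_j = n, 0, \cdots\}\rangle
  = \sqrt{n!}\;\, |\{0, \cdots, 0, n_j = n, 0, \cdots\}\rangle ,
\qquad
|\{n_1, n_2, \cdots, n_j, \cdots\}\rangle
  = \prod_{j} \frac{(\hat{a}_j^{\dagger})^{n_j}}{\sqrt{n_j!}}\; |0\rangle .
```

For fermions each $n_j$ is either 0 or 1, so that all the factorials are unity, and the order in which the creation operators are applied fixes the overall sign of the state.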


The state space made up of all possible symmetrized or antisymmetrized
states (as the case may be) corresponding to all possible values of the
particle number is referred to as the Fock space. States in the Fock space
are conveniently generated by operating on the vacuum state with products of
creation and annihilation operators, raised to various possible powers, as
explained above.
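The action of creation and annihilation operators on occupation-number states can be illustrated with a minimal Python sketch for bosons. It assumes the standard conventions $\hat a_i^\dagger|\cdots n_i \cdots\rangle = \sqrt{n_i+1}\,|\cdots n_i+1 \cdots\rangle$ and $\hat a_i|\cdots n_i \cdots\rangle = \sqrt{n_i}\,|\cdots n_i-1 \cdots\rangle$; the function names and the dictionary representation of a state are illustrative choices, not notation used in the text.

```python
import math

def create(i, state):
    """Apply the bosonic creation operator for mode i to `state`.

    `state` maps occupation tuples (n_1, n_2, ...) to amplitudes.
    """
    out = {}
    for occ, amp in state.items():
        new = occ[:i] + (occ[i] + 1,) + occ[i + 1:]
        out[new] = out.get(new, 0.0) + amp * math.sqrt(occ[i] + 1)
    return out

def annihilate(i, state):
    """Apply the bosonic annihilation operator for mode i to `state`."""
    out = {}
    for occ, amp in state.items():
        if occ[i] == 0:
            continue  # an empty mode gives no contribution (null vector)
        new = occ[:i] + (occ[i] - 1,) + occ[i + 1:]
        out[new] = out.get(new, 0.0) + amp * math.sqrt(occ[i])
    return out

def total_number(state, n_modes):
    """Apply N = sum_i a_i^dagger a_i to `state`."""
    out = {}
    for i in range(n_modes):
        for occ, amp in create(i, annihilate(i, state)).items():
            out[occ] = out.get(occ, 0.0) + amp
    return out

vacuum = {(0, 0, 0): 1.0}
# Two quanta in the first mode and one in the third (unnormalized state).
state = create(0, create(0, create(2, vacuum)))
number_applied = total_number(state, n_modes=3)
```

Here `number_applied` is three times `state`, as expected for a three-particle state, while `annihilate(1, state)` returns the empty dictionary, which plays the role of the null vector and is distinct from `vacuum`.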


In the above paragraphs, there is no expressed preference for any
particular set of basic single-particle states $|u_1\rangle, |u_2\rangle, \cdots$,
and the corresponding annihilation and creation operators
$\hat a_i, \hat a_i^\dagger$ ($i = 1, 2, \cdots$). In specific applications,
however, one finds it convenient to refer to operators in some particular
representation, such as the co-ordinate or the momentum representation, in
terms of which all other relevant operators, such as the Hamiltonian for a
many-body system, can be expressed. The changeover from one representation
to another is effected by standard rules of transformation in quantum theory.


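In the co-ordinate representation, the annihilation and creation
operators, labeled with the continuously varying index $\mathbf r$ (the
position vector), are written as $\hat a_\sigma(\mathbf r), \hat a_\sigma^\dagger(\mathbf r)$
and are referred to as the field operators (at times denoted by
$\hat\psi(\mathbf r, \sigma), \hat\psi^\dagger(\mathbf r, \sigma)$), where the
spin variable ($\sigma$) is included explicitly. In the case of a
spin-$\frac12$ particle, $\sigma$ takes up two values ($+, -$, or
$\uparrow, \downarrow$). The transformation equations from
$\{\hat a_i\}, \{\hat a_i^\dagger\}$ to $\hat a_\sigma(\mathbf r), \hat a_\sigma^\dagger(\mathbf r)$ read

$$\hat a_\sigma(\mathbf r) = \sum_i \langle \mathbf r, \sigma|u_i\rangle\,\hat a_i, \qquad \hat a_\sigma^\dagger(\mathbf r) = \sum_i \langle u_i|\mathbf r, \sigma\rangle\,\hat a_i^\dagger, \tag{7-11a}$$

while the reverse transformation is

$$\hat a_i = \sum_\sigma\int d^3r\,\langle u_i|\mathbf r, \sigma\rangle\,\hat a_\sigma(\mathbf r), \qquad \hat a_i^\dagger = \sum_\sigma\int d^3r\,\langle \mathbf r, \sigma|u_i\rangle\,\hat a_\sigma^\dagger(\mathbf r), \tag{7-11b}$$

where the notation is self-evident (in the case of spinless bosons the spin
index $\sigma$ is to be omitted). The commutation or anti-commutation (as
the case may be) relations are preserved under this transformation. The
number density operator and the number operator in the co-ordinate
representation are given by

$$\hat\rho(\mathbf r) = \sum_\sigma \hat a_\sigma^\dagger(\mathbf r)\hat a_\sigma(\mathbf r), \qquad \hat N = \int d^3r\,\hat\rho(\mathbf r). \tag{7-12}$$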


Other than the co-ordinate representation, the momentum representation
is also frequently made use of. As already mentioned, single-particle
states in this representation are those of a free particle in a cubical box,
subject to the periodic boundary condition (other boundary conditions,
such as the fixed boundary condition, are also possible). In this
representation, the annihilation and creation operators appear as
$\hat a_\sigma(\mathbf k), \hat a_\sigma^\dagger(\mathbf k)$, where $\mathbf k$
stands for a typical 3D momentum vector with components
$\frac{2\pi}{L}(n_1, n_2, n_3)$ and where $n_1, n_2, n_3$ can take up all
possible integer values ($L$ = side of the cube, with volume $V = L^3$; the
state with $n_1 = n_2 = n_3 = 0$ is taken as the vacuum state). As the volume
of the cube is made to go to infinity, the momentum components become
continuous variables. In the following, we will adopt the discrete-$\mathbf k$
representation, in which the annihilation and creation operators for any
specified $\mathbf k$ are written as $\hat a_{\mathbf k\sigma}, \hat a_{\mathbf k\sigma}^\dagger$.
The transformation equations between the co-ordinate and momentum
representations read

$$\hat a_\sigma(\mathbf r) = \frac{1}{L^{3/2}}\sum_{\mathbf k} e^{i\mathbf k\cdot\mathbf r}\,\hat a_{\mathbf k\sigma}, \qquad \hat a_\sigma^\dagger(\mathbf r) = \frac{1}{L^{3/2}}\sum_{\mathbf k} e^{-i\mathbf k\cdot\mathbf r}\,\hat a_{\mathbf k\sigma}^\dagger, \tag{7-13a}$$

$$\hat a_{\mathbf k\sigma} = \frac{1}{L^{3/2}}\int d^3r\,e^{-i\mathbf k\cdot\mathbf r}\,\hat a_\sigma(\mathbf r), \qquad \hat a_{\mathbf k\sigma}^\dagger = \frac{1}{L^{3/2}}\int d^3r\,e^{i\mathbf k\cdot\mathbf r}\,\hat a_\sigma^\dagger(\mathbf r). \tag{7-13b}$$

Results of physical relevance for a macroscopic system will, however,
have to be considered in the thermodynamic limit $V (= L^3) \to \infty$.

Referring to the density operator $\hat\rho(\mathbf r)$ (first equality
in (7-12)), its Fourier transform in the momentum representation appears as

$$\hat\rho(\mathbf q) = \int d^3r\,e^{-i\mathbf q\cdot\mathbf r}\,\hat\rho(\mathbf r) = \sum_{\mathbf k\sigma}\hat a_{\mathbf k\sigma}^\dagger\,\hat a_{\mathbf k+\mathbf q,\sigma}. \tag{7-14}$$


We now consider an interacting many-body system of bosons or fermions
with a Hamiltonian of the form

$$\hat H = \sum_\alpha \frac{\hat{\mathbf p}_\alpha^2}{2m} + \sum_\alpha \hat U(\mathbf r_\alpha) + \frac12\sum_\alpha\sum_{\gamma\neq\alpha}\hat v(\mathbf r_\alpha, \mathbf r_\gamma), \tag{7-15}$$

where the first and second terms are sums of one-particle operators,
representing the kinetic energy of the system and its potential energy in an
external classical field with which each particle interacts independently.
The third term in the Hamiltonian expresses the potential energy of the
system in virtue of a two-body interaction, where all pairs of particles
interact identically. We assume that the pair potential $v(\mathbf r, \mathbf r')$
is symmetric in $\mathbf r, \mathbf r'$ and depends only on the separation
$\mathbf r' - \mathbf r$. In formula (7-15), the particle index $\alpha$ is
mentioned explicitly, but now as a sub-index, for the sake of convenience.
The number of particles is not explicitly mentioned in this formula; when
considered for a system with a given number (say, $N$) of particles,
$\alpha, \gamma$ are to be summed from 1 to $N$ (with $\gamma \neq \alpha$).

The operators $\hat U(\mathbf r_\alpha)$ and $\hat v(\mathbf r_\alpha, \mathbf r_\gamma)$
are defined in terms of functions $U(\mathbf r), v(\mathbf r, \mathbf r')$ as

$$\hat U(\mathbf r_\alpha) = U(\mathbf r_\alpha)\,|\mathbf r_\alpha\rangle\langle\mathbf r_\alpha|, \qquad \hat v(\mathbf r_\alpha, \mathbf r_\gamma) = v(\mathbf r_\alpha, \mathbf r_\gamma)\,|\mathbf r_\alpha\mathbf r_\gamma\rangle\langle\mathbf r_\alpha\mathbf r_\gamma|. \tag{7-16}$$

According to the assumption mentioned above, the potential energy function
$v(\mathbf r', \mathbf r'')$ depends only on $\mathbf r = \mathbf r'' - \mathbf r'$.
It is the function $v(\mathbf r)$ whose Fourier transform $v(\mathbf q)$ will
be seen to feature in the expression for $\hat H$ in terms of the creation and
annihilation operators in the momentum representation (see eq. (7-21) below).


In this context, the following results prove to be useful ([114], Appendix A):

Let $\sum_\alpha \hat T_\alpha$ be a sum of single-particle operators, where
the sub-index $\alpha$ refers to individual particles (as in the first two
terms of (7-15)). This means that there is an operator $\hat T$ for any
arbitrarily chosen particle considered all by itself such that, when the
particles in an assembly are considered severally and are distinguished by
the index $\alpha$, the action of the said operator, when looked at in the
context of the state space ($\mathcal H^{(\alpha)}$) of the $\alpha$th
particle, can be expressed as $\hat T_\alpha$. The sum of the operators
$\hat T_\alpha$ over the particles making up the system can then be expressed
(in respect of its action on completely symmetrized or completely
antisymmetrized many-body states) in terms of the creation and annihilation
operators (relative to a specified basis of single-particle states $S$ for
an arbitrarily chosen particle considered all by itself) as

$$\sum_\alpha \hat T_\alpha = \sum_{ij}\langle u_i|\hat T|u_j\rangle\,\hat a_i^\dagger\hat a_j. \tag{7-17}$$

Here the matrix element $\langle u_i|\hat T|u_j\rangle$ can be evaluated by
referring to the space $\mathcal H^{(\alpha)}$ for any particular value of
$\alpha$ and evaluating $\langle u_i^{(\alpha)}|\hat T_\alpha|u_j^{(\alpha)}\rangle$.

Similarly, let us consider a two-body operator $\hat v_{\alpha\gamma}$
($\alpha, \gamma = 1, 2, \cdots, N$, $\alpha \neq \gamma$), whose action on a
basis vector $|u_i^{(\alpha)}u_j^{(\gamma)}\rangle$ ($i, j = 1, 2, \cdots$) in
$\mathcal H^{(\alpha)}\otimes\mathcal H^{(\gamma)}$ can be represented as

$$\hat v_{\alpha\gamma}\,|u_i^{(\alpha)}u_j^{(\gamma)}\rangle = \sum_{kl} C_{ijkl}\,|u_k^{(\alpha)}u_l^{(\gamma)}\rangle, \tag{7-18}$$

where $C_{ijkl}$ ($i, j, k, l = 1, 2, \cdots$) are appropriate superposition
constants, subject to normalization. On the other hand, $\hat v_{\alpha\gamma}$
can be looked upon as an operator in $\bigotimes_\alpha \mathcal H^{(\alpha)}$
by defining its action on a product vector of the form
$|u_{i_1}^{(1)}\cdots u_{i_\alpha}^{(\alpha)}\cdots u_{i_\gamma}^{(\gamma)}\cdots u_{i_N}^{(N)}\rangle$
(note the way the sub-indices have been named; these take up positive
integral values) as

$$\hat v_{\alpha\gamma}\,|u_{i_1}^{(1)}\cdots u_{i_\alpha}^{(\alpha)}\cdots u_{i_\gamma}^{(\gamma)}\cdots u_{i_N}^{(N)}\rangle = \sum_{kl} C_{i_\alpha i_\gamma kl}\,|u_{i_1}^{(1)}\cdots u_k^{(\alpha)}\cdots u_l^{(\gamma)}\cdots u_{i_N}^{(N)}\rangle. \tag{7-19}$$

In other words, $\hat v_{\alpha\gamma}$ (we have assumed $\alpha < \gamma$ in
writing the above formula) leaves unaffected all factors in a product vector
except those in $\mathcal H^{(\alpha)}, \mathcal H^{(\gamma)}$ (recall that all
acceptable many-body states can be expressed as sums of product vectors).

If now we consider a many-body operator that can be expressed as a sum of
two-body operators such as $\hat v_{\alpha\gamma}$, then it can be expressed
in terms of the creation and annihilation operators $\hat a_i, \hat a_i^\dagger$
(in respect of its action on completely symmetrized or completely
antisymmetrized states) as

$$\sum_{\gamma\neq\alpha}\hat v_{\alpha\gamma} = \sum_{ijkl}\langle u_iu_j|\hat v|u_ku_l\rangle\,\hat a_i^\dagger\hat a_j^\dagger\hat a_l\hat a_k, \tag{7-20}$$

where, analogous to the one-particle case, the coefficient
$\langle u_iu_j|\hat v|u_ku_l\rangle$ can be known by referring, for a specified
pair $\alpha, \gamma$, to $\mathcal H^{(\alpha)}\otimes\mathcal H^{(\gamma)}$
and evaluating the matrix element
$\langle u_i^{(\alpha)}u_j^{(\gamma)}|\hat v_{\alpha\gamma}|u_k^{(\alpha)}u_l^{(\gamma)}\rangle$
between product vectors.


Making use of the results (7-17), (7-20), one can express the Hamiltonian
(7-15) in the following form in respect of its action on completely
symmetrized or completely antisymmetrized many-body states in the momentum
representation ([114], Appendix A):

$$\hat H = \sum_{\mathbf k\sigma}\frac{\hbar^2k^2}{2m}\,\hat a_{\mathbf k\sigma}^\dagger\hat a_{\mathbf k\sigma} + \frac1V\sum_{\mathbf k\mathbf q\sigma}U(\mathbf q)\,\hat a_{\mathbf k+\mathbf q,\sigma}^\dagger\hat a_{\mathbf k\sigma} + \frac{1}{2V}\sum_{\mathbf k\mathbf k'\mathbf q\sigma\sigma'}v(\mathbf q)\,\hat a_{\mathbf k+\mathbf q,\sigma}^\dagger\hat a_{\mathbf k'-\mathbf q,\sigma'}^\dagger\hat a_{\mathbf k'\sigma'}\hat a_{\mathbf k\sigma}. \tag{7-21}$$

In this formula, $U(\mathbf q), v(\mathbf q)$ are Fourier transforms of the
functions $U(\mathbf r)$ and $v(\mathbf r)$ in terms of which the operators
$\hat U, \hat v$ are defined (refer to (7-16) and the paragraph following it).
Here the two-body interaction term on the right hand side is a summation
over terms, a typical member of which can be interpreted as representing two
'incoming' particles with momenta $\mathbf k, \mathbf k'$ interacting with each
other with a strength $v(\mathbf q)$ and getting transformed to two 'outgoing'
particles with momenta $\mathbf k + \mathbf q, \mathbf k' - \mathbf q$. The spins
($\sigma, \sigma'$) and the total momentum remain unchanged in the interaction
since the Hamiltonian (7-15) does not involve spins in the interaction part
and commutes with the total momentum operator of the interacting particles.
In the following, we will consider applications featuring similar
expressions in terms of creation and annihilation operators in the momentum
representation.

1. The expression (7-21) does not involve any reference to the particle
number, but it commutes with the total number operator
$\hat N = \sum_{\mathbf k\sigma}\hat a_{\mathbf k\sigma}^\dagger\hat a_{\mathbf k\sigma}$,
which means that the two operators have common eigenstates that appear as
eigenstates of (7-15) written for a given number ($N$) of particles
(not explicitly mentioned in (7-15)).

2. The Fourier transform $v(\mathbf q)$ is defined as

$$v(\mathbf q) = \int d^3r\,e^{-i\mathbf q\cdot\mathbf r}\,v(\mathbf r), \tag{7-22}$$

while $U(\mathbf q)$ is defined similarly.
7.3 Weakly interacting Bose gas at low temperatures


7.3.1 Weakly interacting Bose gas: introduction


The behavior of an ideal Bose gas at low temperatures was considered
in sec. 3.3.7.2 where the phenomenon of Bose-Einstein (BE) condensation
was encountered. The ideal Bose gas is characterized by a number of odd
features (e.g., an infinitely large compressibility below the BE
transition temperature) in virtue of the absence of interaction among the
molecules. In the present section, we will have a look at the energy
spectrum of a weakly interacting Bose gas where the idea of quasi-particles,
analogous to the phonons in a solid (refer back to section 3.4.1.3), as
vehicles of excitation, is made use of. A quasi-particle is characterized
by a momentum $\mathbf p$ associated with a wave vector
$\mathbf k (= \frac{\mathbf p}{\hbar})$ labeling possible single-particle
states, and an energy $\epsilon(p)$. For a dilute gas at sufficiently low
temperatures the quasi-particles are independent of one another, and the
energy spectrum of the gas is obtained by working out the
$\epsilon(p)$-$p$ relation (referred to as the 'dispersion formula' below)
for the quasi-particles. Following [113], [67] (see also [114], [77]), we
sketch the derivation of the dispersion formula, making use of ideas
initiated by Bogoliubov. The derivation is based on the assumption that the
gas suffers a BE condensation at such temperatures. From a fundamental point
of view, BE condensation is a consequence of the symmetry requirement on the
wave functions of the assembly of bosons, which continues to play its role
when the molecules of the gas interact among themselves. Among other
things, the $\epsilon(p)$-$p$ dispersion relation implies the possibility of
superfluidity which, however, differs in a number of ways from the
superfluidity of liquid He-4, to be briefly discussed in sec. 7.4.


7.3.2 Weakly interacting Bose gas: the scattering length approximation


A dilute gas satisfies the criterion $r_0 \ll (\frac VN)^{1/3}$, i.e., the
range of inter-molecular interaction ($r_0$) is to be small compared to the
average separation between the molecules ($V, N$ stand for the volume of the
gas and the number of molecules in it), where it is assumed that interactions
between more than two molecules at a time can be ignored. The molecules
interact by means of two-body scattering, which, at sufficiently low
temperatures ($T \ll \frac{\hbar^2}{2mk_Br_0^2}$), can be assumed to be
entirely described by the s-wave scattering length $a$, which is assumed to
be positive. The criterion for weak interaction can be stated as
$a \ll (\frac VN)^{1/3}$.

We begin from the expression for the Hamiltonian (7-21) while considering
a gas made up of spinless bosons in a volume $L^3 = V$, in the absence of
an external field acting on them. This gives

$$\hat H = \sum_{\mathbf k}\frac{\hbar^2k^2}{2m}\,\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \frac{1}{2V}\sum_{\mathbf k\mathbf k'\mathbf q}v(\mathbf q)\,\hat a_{\mathbf k+\mathbf q}^\dagger\hat a_{\mathbf k'-\mathbf q}^\dagger\hat a_{\mathbf k'}\hat a_{\mathbf k}. \tag{7-23a}$$

In this expression $v(\mathbf q)$ denotes the Fourier transform of the
two-body potential $V(\mathbf r)$ describing the interaction between the
bosons. However, in the approximation mentioned above, the detailed
structure of the potential is not relevant so long as the scattering length
has a specified positive value ($a$), and one may replace the actual
potential, possibly involving a hard core with a sharp boundary, with a
smooth repulsive potential $V_{\rm eff}(\mathbf r)$ describing the effective
interaction between the bosons. Further, since single-particle states with
only very small values of the momentum components are relevant for the
temperature range we are interested in, the momentum transfer $\mathbf q$ is
close to zero in the summation in the second term of the above equation.
Assuming that $v(\mathbf q)$ is sufficiently smooth in $\mathbf q$, we
approximate $v(\mathbf q) \approx v_0$ for $\mathbf q \approx 0$, where $v_0$
is a constant, and write the Hamiltonian in the form

$$\hat H = \sum_{\mathbf k}\frac{\hbar^2k^2}{2m}\,\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \frac{v_0}{2V}\sum_{\mathbf k\mathbf k'\mathbf q}\hat a_{\mathbf k+\mathbf q}^\dagger\hat a_{\mathbf k'-\mathbf q}^\dagger\hat a_{\mathbf k'}\hat a_{\mathbf k}. \tag{7-23b}$$

The constant $v_0$ can be related to the scattering length $a$ ([113]) by
adopting the Born approximation which, in the leading order, gives

$$v_0 = \frac{4\pi\hbar^2a}{m}. \tag{7-24}$$
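The relation (7-24) can be tried out numerically for a smooth model potential. The sketch below assumes a hypothetical Gaussian effective potential $V_{\rm eff}(r) = V_h e^{-r^2/2w^2}$ in units with $\hbar = m = 1$ (the height, the width, and all function names are illustrative and not taken from the text); it evaluates $v_0 = \int d^3r\,V_{\rm eff}(r)$, the $\mathbf q = 0$ value of the Fourier transform (7-22), and inverts (7-24) for the leading-order scattering length.

```python
import math

def v0_from_potential(potential, r_max, steps=20000):
    """q = 0 Fourier component: 4*pi * integral of r^2 V(r) dr (midpoint rule)."""
    h = r_max / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        total += r * r * potential(r)
    return 4.0 * math.pi * total * h

def scattering_length_born(v0, mass=1.0, hbar=1.0):
    """Invert v_0 = 4*pi*hbar^2*a/m (leading-order Born approximation)."""
    return mass * v0 / (4.0 * math.pi * hbar ** 2)

V_HEIGHT = 0.05   # hypothetical potential height (units with hbar = m = 1)
WIDTH = 1.0       # hypothetical potential width

def gaussian_potential(r):
    return V_HEIGHT * math.exp(-r * r / (2.0 * WIDTH ** 2))

v0 = v0_from_potential(gaussian_potential, r_max=12.0 * WIDTH)
a_born = scattering_length_born(v0)
```

For the Gaussian the integral is known in closed form, $v_0 = V_h(2\pi)^{3/2}w^3$, which gives a check on the quadrature. The Born estimate is meaningful only for a weak, smooth potential, which is why the hard-core potential is replaced by $V_{\rm eff}$ in the first place.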


7.3.3 The Bogoliubov approximation scheme 


Following Bogoliubov, we now make the crucial assumption that the
single-particle ground state ($\mathbf k = 0$) is populated by a macroscopic
fraction of the total number of particles, as in Bose-Einstein (BE)
condensation.

A weak interaction among the particles making up the gas introduces small
changes in the various features of the BE condensation in an ideal gas (refer
back to section 3.3.7.2), among which some get modified in a notable manner.
The perturbation theory to be outlined below shows that the isothermal
compressibility, which is infinitely large below the transition temperature in
the case of the ideal Bose gas, now becomes finite, and the specific heat,
which changes continuously across the transition temperature for the ideal
gas, now undergoes a finite discontinuity. As in the case of the ideal gas, we
assume that the single-particle ground state is macroscopically populated
and, as for the higher energy single-particle states, only the low-lying ones
are populated and that too sparsely. In this context, recall that the states of
the assembly of bosons can be specified in terms of the occupation numbers
of single-particle states. An equivalent description would be in terms of the
numbers of quasi-particles (see below) with various values of momentum
and energy.

If the population of the single-particle ground state be denoted by $n_0$
then, in the leading approximation (see below for the perturbative correction
to the leading approximation; the possibility of the transition to
superfluidity appears only in consequence of this correction)

$$n_0 \approx N, \tag{7-25a}$$

and the ground state energy of the Bose gas is obtained from (7-23b) as

$$E_0 = \frac{v_0N^2}{2V} = \frac{2\pi\hbar^2aN^2}{mV} = \frac{2\pi\hbar^2aN}{mv} \qquad \Big(v = \frac VN\Big), \tag{7-25b}$$

where the second equality is obtained by making use of (7-24).
Eq. (7-25b) is obtained by approximating $\hat a_0, \hat a_0^\dagger$ with the
c-number $\sqrt{n_0} \approx \sqrt N$ in the leading order (which implies that
the populations of the single-particle excited states can be taken to be all
zero) and considering only the term $\mathbf k = \mathbf k' = \mathbf q = 0$ in
the summation over $\mathbf k, \mathbf k', \mathbf q$ while ignoring all the
other terms, which are of smaller order.


At $T = 0$ the free energy in this leading approximation is $F \approx E_0$
and one obtains the pressure of the weakly interacting Bose gas as

$$p = -\frac{\partial F}{\partial V} = \frac{2\pi\hbar^2a}{m}\,n^2 = \frac{gn^2}{2}, \tag{7-26}$$

where $n (= \frac NV)$ stands for the number density of the molecules, and

$$g = \frac{4\pi\hbar^2a}{m} \tag{7-27a}$$

is an effective coupling constant which, in the lowest order of
perturbation, is given by

$$\text{[lowest order:]} \qquad g = v_0. \tag{7-27b}$$

Note that (7-26) expresses a macroscopic property of the gas in terms of
the single microscopic parameter $g$. The formula (7-27b) gets modified in
the next order of perturbation (see eq. (7-31) below). Expressed in terms
of $g$ (or, equivalently, of the scattering length $a$), the results of the
present section remain valid for a point-like hard sphere repulsion.


One can also work out the excited states in this leading order of
approximation and then obtain the partition function and other thermodynamic
functions ([67], chapter 12). In particular, an isothermal curve of the
weakly interacting Bose gas looks as in fig. 7-1 where a typical isotherm
for the ideal Bose gas is also shown for the sake of comparison. Note how
the two isotherms differ below a critical value of the specific volume
($v_c$, which is assumed to be the same for the two graphs): the
compressibility for the weakly interacting Bose gas is finite, while that for
the ideal gas is infinitely large.


Figure 7-1: Depicting a p-v isotherm for a weakly interacting Bose gas, at a
temperature $T$ lower than the BE condensation temperature $T_0$ (solid line;
schematic); an isotherm (schematic) for the ideal Bose gas is also shown for
the sake of comparison; the two isotherms differ notably below a critical
value of the specific volume ($v_c$, which is assumed to be the same for the
two graphs): the compressibility for the weakly interacting Bose gas is
finite, while that for the ideal gas is infinitely large; based on [67],
fig. 12.10.


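Having had a look at the leading order of the approximation scheme under
consideration, we now consider the next order in which we set two of the
four operators in each term of the summation over
$\mathbf k, \mathbf k', \mathbf q$ in (7-23b) at $\sqrt N$ while retaining the
other two, a procedure dictated by consistency. At the same time the leading
term in the summation, with $\mathbf k = \mathbf k' = \mathbf q = 0$, is to be
replaced not with $N^2$ (as in the lower order of approximation considered
above) but, for the sake of consistency, with

$$N^2 - 2N\sum_{\mathbf k}{}'\,\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k}, \tag{7-28}$$

where the prime over the summation means that the term with $\mathbf k = 0$
is to be excluded. With these substitutions in mind, we obtain, from (7-23b),

$$\text{[next higher order:]} \quad \hat H = \frac{v_0}{2V}\,\hat a_0^\dagger\hat a_0^\dagger\hat a_0\hat a_0 + \sum_{\mathbf k}{}'\frac{\hbar^2k^2}{2m}\,\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \frac{v_0}{2V}\sum_{\mathbf k}{}'\big[4\hat a_0^\dagger\hat a_0\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \hat a_0\hat a_0\hat a_{\mathbf k}^\dagger\hat a_{-\mathbf k}^\dagger + \hat a_0^\dagger\hat a_0^\dagger\hat a_{\mathbf k}\hat a_{-\mathbf k}\big], \tag{7-29}$$

i.e.,

$$\hat H = \frac{gN^2}{2V} + \sum_{\mathbf k}{}'\frac{\hbar^2k^2}{2m}\,\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \frac{gN}{2V}\sum_{\mathbf k}{}'\Big[2\hat a_{\mathbf k}^\dagger\hat a_{\mathbf k} + \hat a_{\mathbf k}^\dagger\hat a_{-\mathbf k}^\dagger + \hat a_{\mathbf k}\hat a_{-\mathbf k} + \frac{mgN}{V\hbar^2k^2}\Big] \tag{7-30}$$

(check these relations out) where we have made use of (7-28) and the
following relation between $v_0$ and $g$ in the next order of approximation
to (7-27b) (refer to [113], eq. (4.21)),

$$v_0 = g\Big(1 + \frac gV\sum_{\mathbf k}{}'\frac{m}{\hbar^2k^2}\Big). \tag{7-31}$$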


It now remains to diagonalize the Hamiltonian (7-30) in order to obtain
the energy spectrum to the order of approximation mentioned. This is
accomplished by making use of the Bogoliubov transformation and defining
a new set of operators $\hat b_{\mathbf k}, \hat b_{\mathbf k}^\dagger$ by

$$\hat a_{\mathbf k} = p_k\hat b_{\mathbf k} + q_k\hat b_{-\mathbf k}^\dagger, \qquad \hat a_{\mathbf k}^\dagger = p_k\hat b_{\mathbf k}^\dagger + q_k\hat b_{-\mathbf k}, \tag{7-32a}$$

for all $\mathbf k$, such that (a) the new operators
$\hat b_{\mathbf k}, \hat b_{\mathbf k}^\dagger$ satisfy bosonic commutation
rules as do the old operators $\hat a_{\mathbf k}, \hat a_{\mathbf k}^\dagger$,
and (b) the Hamiltonian (7-30), expressed in terms of the new set of
operators, is diagonal, containing only c-numbers and operators of the form
$\hat b_{\mathbf k}^\dagger\hat b_{\mathbf k}$. Here $p_k, q_k$ are real numbers
to be chosen so as to satisfy these two requirements. It turns out that this
is possible with the choice

$$p_k = \cosh u_k, \qquad q_k = \sinh u_k, \tag{7-32b}$$

where the parameters $u_k$ are given by

$$\coth 2u_k = -\frac{\frac{\hbar^2k^2}{2m} + gn}{gn}, \tag{7-32c}$$

corresponding to

$$\cosh 2u_k = \frac{\frac{\hbar^2k^2}{2m} + gn}{\Big[\big(\frac{\hbar^2k^2}{2m}\big)^2 + \frac{\hbar^2k^2}{m}gn\Big]^{1/2}}, \qquad \sinh 2u_k = \frac{-gn}{\Big[\big(\frac{\hbar^2k^2}{2m}\big)^2 + \frac{\hbar^2k^2}{m}gn\Big]^{1/2}} \tag{7-32d}$$

(check this statement out). Once the diagonalization is achieved, one
can interpret $\hat b_{\mathbf k}, \hat b_{\mathbf k}^\dagger$ as the
annihilation and creation operators for non-interacting quasi-particles of
momentum $\hbar\mathbf k$, where these quasi-particles represent the effect of
the interaction between the bosonic constituents of the weakly interacting
gas we started with. The effective Hamiltonian $\hat H_{\rm eff}$ (i.e., the
Hamiltonian operator in the approximation used in arriving at (7-30)) now
appears in the form

$$\hat H_{\rm eff} = E_0 + \sum_{\mathbf k}\epsilon(\mathbf k)\,\hat b_{\mathbf k}^\dagger\hat b_{\mathbf k}, \tag{7-33a}$$

where $E_0$ denotes the ground state energy and $\epsilon(\mathbf k)$ the
energy of a quasi-particle of momentum $\hbar\mathbf k$. We write the
expressions for $E_0$ and $\epsilon(\mathbf k)$ resulting from the exercise
outlined above (try the algebra out) in terms of the scattering length $a$,
related to the effective coupling constant $g$ by (7-27a) (the renormalized
relation (7-31) between $g$ and $v_0$ yields a corresponding relation between
$v_0$ and $a$; $v$ denotes the specific volume $\frac VN (= \frac1n)$):

$$E_0 = \frac{2\pi a\hbar^2N}{mv} + \sum_{\mathbf k}{}'\Big[-\frac{2\pi a\hbar^2}{mv} - \frac{\hbar^2k^2}{4m} + \frac{8\pi^2a^2\hbar^2}{mv^2k^2} + \frac{\hbar k}{4m}\sqrt{\hbar^2k^2 + \frac{16\pi a\hbar^2}{v}}\,\Big], \tag{7-33b}$$

$$\epsilon(\mathbf k) = \frac{\hbar k}{2m}\sqrt{\hbar^2k^2 + \frac{16\pi a\hbar^2}{v}}. \tag{7-33c}$$

The stationary states of $\hat H_{\rm eff}$ are then described in terms of
the numbers ($\nu_{\mathbf k}$) of quasi-particles of momentum $\hbar\mathbf k$
(the momentum acquires relevance in describing interactions among the
quasi-particles which, however, are absent in the present approximation)
where $\nu_{\mathbf k}$ are the eigenvalues of
$\hat b_{\mathbf k}^\dagger\hat b_{\mathbf k}$. The ground state, of energy
$E_0$ (formula (7-33b)), corresponds to $\nu_{\mathbf k} = 0$ for all $\mathbf k$.
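The statements (7-32b) to (7-32d) and the spectrum (7-33c) can be checked numerically. In the sketch below the kinetic energy $\frac{\hbar^2k^2}{2m}$ and the product $gn$ are passed in directly (any consistent energy unit will do); the function names are illustrative. The coefficients of the diagonal and off-diagonal operator combinations are those obtained by substituting (7-32a) into the part of (7-30) that couples $\mathbf k$ and $-\mathbf k$, an intermediate step worked out here and not displayed in the text.

```python
import math

def bogoliubov_coefficients(kinetic, gn):
    """Return (p_k, q_k) = (cosh u_k, sinh u_k) with coth 2u_k = -(kinetic + gn)/gn."""
    u = 0.5 * math.atanh(-gn / (kinetic + gn))
    return math.cosh(u), math.sinh(u)

def quasi_particle_energy(kinetic, gn):
    """epsilon(k) of (7-33c), written as sqrt(kinetic^2 + 2*kinetic*gn)."""
    return math.sqrt(kinetic ** 2 + 2.0 * kinetic * gn)

def transformed_coefficients(kinetic, gn):
    """Coefficients multiplying (b_k^+ b_k + b_-k^+ b_-k) and (b_k^+ b_-k^+ + b_k b_-k)."""
    p, q = bogoliubov_coefficients(kinetic, gn)
    a_coeff = kinetic + gn
    diagonal = a_coeff * (p * p + q * q) + 2.0 * gn * p * q
    off_diagonal = 2.0 * a_coeff * p * q + gn * (p * p + q * q)
    return diagonal, off_diagonal

for kinetic in (0.01, 1.0, 50.0):
    p_k, q_k = bogoliubov_coefficients(kinetic, gn=1.0)
    diag, off = transformed_coefficients(kinetic, gn=1.0)
    print(kinetic, p_k ** 2 - q_k ** 2, off, diag, quasi_particle_energy(kinetic, 1.0))
```

For each value of the kinetic energy one finds $p_k^2 - q_k^2 = 1$ (so that the bosonic commutation rules are preserved), a vanishing off-diagonal coefficient, and a diagonal coefficient equal to $\epsilon(\mathbf k)$, all up to rounding error.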


In the expression for the ground state energy $E_0$, which includes the
correction to (7-25b), there appears the divergent term
$\sum_{\mathbf k}{}'\frac{8\pi^2a^2\hbar^2}{mv^2k^2}$ which, however, gets
canceled with an opposite contribution coming from the expansion of
$\sqrt{\hbar^2k^2 + \frac{16\pi a\hbar^2}{v}}$ in powers of $\frac{1}{k^2}$. In
the limit $V \to \infty$, the sum over $\mathbf k$ reduces to an integral in
accordance with the prescription
$\sum_{\mathbf k} \to \frac{V}{(2\pi)^3}\int d^3k$ (check this out) and one
gets ([67])

$$E_0 \approx \frac{2\pi a\hbar^2N}{mv}\Big(1 + \frac{128}{15\sqrt\pi}\sqrt{\frac{a^3}{v}}\,\Big). \tag{7-34}$$

One observes that the small parameter featuring in the correction is
$\frac{a^3}{v} = na^3$, the smallness of which is precisely the condition,
mentioned earlier, for the inter-molecular interaction to qualify as a weak
one. In a more elaborate calculation of the ground state energy one obtains
terms of higher orders of smallness following the correction term in (7-34),
the next term being of the order of $\frac{a^3}{v}\ln\frac{a^3}{v}$.
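The numerical coefficient in (7-34) can be recovered by a direct quadrature. With the substitution $\frac{\hbar^2k^2}{2m} = gn\,t^2$ (where $gn = \frac{4\pi a\hbar^2}{mv}$), the bracket in (7-33b) becomes $\frac{gn}{2}\big[t\sqrt{t^2+2} - t^2 - 1 + \frac{1}{2t^2}\big]$, and the correction factor in (7-34) equals $\frac{(8\pi)^{3/2}}{2\pi^2}\,I\,\sqrt{na^3}$ with $I = \int_0^\infty t^2[\cdots]\,dt$. The sketch below is only an illustration under these substitutions; the integrand is rewritten in an algebraically equivalent form that avoids cancellation between large terms, and the function names are not from the text.

```python
import math

def lhy_integrand(t):
    """t^2 * [t*sqrt(t^2+2) - t^2 - 1 + 1/(2 t^2)] in a cancellation-free form."""
    s = math.sqrt(t * t + 2.0)
    numerator = 2.0 * t / (s + t) + 1.0        # equals t*s - t^2 + 1
    denominator = 2.0 * (t * s + t * t + 1.0)
    return numerator / denominator

def lhy_integral(steps=100000):
    """Integrate over t in (0, infinity) via t = u/(1-u) and the midpoint rule in u."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        t = u / (1.0 - u)
        total += lhy_integrand(t) / (1.0 - u) ** 2
    return total * h

integral = lhy_integral()
coefficient = (8.0 * math.pi) ** 1.5 / (2.0 * math.pi ** 2) * integral
```

The integrand tends to $\frac12$ at $t = 0$ and falls off as $\frac{1}{2t^2}$ at large $t$, which is the numerical counterpart of the cancellation of the divergent term mentioned above. The quadrature gives $I = \frac{8\sqrt2}{15}$ and hence the coefficient $\frac{128}{15\sqrt\pi}$.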


Turning now to the formula (7-33c), written in terms of the momentum
variable $\mathbf p$ instead of $\mathbf k (= \frac{\mathbf p}{\hbar})$, one
obtains the required $\epsilon(p)$-$p$ dispersion formula for the
quasi-particles:

$$\epsilon(p) = \frac{p}{2m}\sqrt{p^2 + \frac{16\pi a\hbar^2}{v}}. \tag{7-35a}$$

This implies a graphical representation of the general nature of fig. 7-2
where there is a linear part for $p \approx 0$ given by

$$\epsilon(p) \approx cp, \qquad c = \frac1m\sqrt{\frac{4\pi a\hbar^2}{v}}, \tag{7-35b}$$

$c$ being the velocity of sound, i.e., of the collective excitation carried
by the quasi-particles, which we refer to as 'phonons'.

The velocity of sound in a fluid is given by the hydrodynamic formula

$$c^2 = \frac{\partial p}{\partial\rho}, \tag{7-36}$$

where $\rho = \frac mv$ ($v = \frac VN = \frac1n$) stands for the density of
the fluid. With $p$ given by (7-26) for a weakly interacting Bose gas, $c$ is
found to be given precisely by the second equality in (7-35b). One can use
(7-34) to calculate the pressure, and hence the velocity of sound in the next
order of approximation, and that too agrees with what one derives from the
spectral formula, treated with the appropriate degree of accuracy.
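The agreement between the hydrodynamic formula (7-36) and the second equality in (7-35b) can be verified with a few lines of Python. The sketch uses hypothetical reduced units ($\hbar = m = 1$) and an arbitrary illustrative value of the scattering length; the pressure is the leading-order expression (7-26), and the derivative is taken by a central difference.

```python
import math

HBAR, MASS, A_SCATT = 1.0, 1.0, 0.01    # hypothetical reduced units
COUPLING = 4.0 * math.pi * HBAR ** 2 * A_SCATT / MASS   # g of (7-27a)

def pressure(mass_density):
    """Leading-order pressure (7-26) as a function of rho = m n."""
    n = mass_density / MASS
    return 0.5 * COUPLING * n * n

def sound_velocity_hydrodynamic(mass_density, rel_step=1.0e-4):
    """c from c^2 = dp/d(rho), using a central difference."""
    d = rel_step * mass_density
    slope = (pressure(mass_density + d) - pressure(mass_density - d)) / (2.0 * d)
    return math.sqrt(slope)

def sound_velocity_spectrum(number_density):
    """c from the second equality in (7-35b), with v = 1/n."""
    return math.sqrt(4.0 * math.pi * A_SCATT * HBAR ** 2 * number_density) / MASS

c_hydro = sound_velocity_hydrodynamic(MASS * 2.0)
c_spec = sound_velocity_spectrum(2.0)
```

Both routes give $c = \sqrt{gn/m}$, so `c_hydro` and `c_spec` agree to rounding error for any density.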


On the other hand, at large $p$, the energy $\epsilon(p)$ of a
quasi-particle approaches $\frac{p^2}{2m}$, the energy of a free particle.
Indeed, for large $p$, the dominant mode of excitation is by means of
momentum gain by the individual particles constituting the gas rather than by
means of collective excitations. This is apparent by noting that, for
$k \to \infty$, the quasi-particle creation and annihilation operators
$\hat b_{\mathbf k}^\dagger, \hat b_{\mathbf k}$ reduce to the particle creation
and annihilation operators $\hat a_{\mathbf k}^\dagger, \hat a_{\mathbf k}$
since the coefficients of transformation (7-32a) now become
$p_k \to 1, q_k \to 0$. The cross-over from the phonon to the particle mode of
excitation occurs at around $p = \sqrt2\,mc$, corresponding to a spatial
length scale $\xi = \frac{\hbar}{\sqrt2\,mc}$ referred to as the healing
length.
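To get a feel for the scales involved, the following sketch evaluates the sound velocity, the cross-over momentum and the healing length for a set of assumed parameters roughly representative of a dilute alkali gas. The mass, scattering length and density below are illustrative assumptions, not values quoted in the text.

```python
import math

HBAR = 1.054571817e-34   # J s
MASS = 1.443e-25         # kg, roughly a rubidium-87 atom (assumed)
A_SCATT = 5.3e-9         # m, assumed s-wave scattering length
DENSITY = 1.0e20         # m^-3, assumed number density n = 1/v

def dispersion(p):
    """Quasi-particle energy (7-35a) with 1/v = n."""
    return p / (2.0 * MASS) * math.sqrt(p * p + 16.0 * math.pi * A_SCATT * HBAR ** 2 * DENSITY)

# Sound velocity from (7-35b)
sound_velocity = math.sqrt(4.0 * math.pi * A_SCATT * HBAR ** 2 * DENSITY) / MASS
# Cross-over momentum and healing length
p_cross = math.sqrt(2.0) * MASS * sound_velocity
healing_length = HBAR / p_cross
# Small parameter n a^3 of the theory
gas_parameter = DENSITY * A_SCATT ** 3

low_ratio = dispersion(1.0e-3 * p_cross) / (sound_velocity * 1.0e-3 * p_cross)
high_ratio = dispersion(100.0 * p_cross) / ((100.0 * p_cross) ** 2 / (2.0 * MASS))
```

Well below the cross-over `low_ratio` is close to 1 (phonon regime, $\epsilon \approx cp$), and well above it `high_ratio` is close to 1 (free-particle regime, $\epsilon \approx \frac{p^2}{2m}$). The healing length obtained this way equals $(8\pi na)^{-1/2}$, as follows from combining $\xi = \frac{\hbar}{\sqrt2\,mc}$ with (7-35b).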


The excitation spectrum for a weakly interacting Bose gas depicted
schematically in fig. 7-2 differs from the excitation spectrum for liquid
He-4 obtained experimentally (see fig. 7-5) where one finds a non-monotonic
variation of $\epsilon(p)$ which implies, in addition to the phonon-mediated
excitations, a second type of collective excitations referred to as rotons.
This implies the existence of a critical velocity $v_c$ for liquid He-4 lower
than the sound velocity in it such that for a flow velocity less than $v_c$
the liquid flows without dissipation (in reality, the critical velocity for
liquid He-4 is less than the one obtained from the excitation spectrum in
virtue of another type of excitation, namely, vortices, to be discussed
below). The critical velocity for a weakly interacting Bose gas, on the other
hand, equals the sound velocity, as seen by making use of the so-called
Landau criterion (see sec. 7.4.2 below). Finally, the critical velocity for
an ideal Bose gas is zero by the same criterion since all excitations in an
ideal gas are of the single-particle type.


Figure 7-2: Depicting the $\epsilon(p)$-$p$ relation for a weakly interacting
Bose gas as described by formula (7-35a) (schematic) where the low-energy
phonon-based excitations, with $\epsilon(p) \approx cp$, are seen to be
demarcated from the high-energy particle-mediated excitations, with
$\epsilon(p) \approx \frac{p^2}{2m}$, by the intervening cross-over region
around $p = \sqrt2\,mc$; based on [113], fig. 4.3.
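The comparison of critical velocities can be illustrated numerically, assuming the usual statement of the Landau criterion, $v_c = \min_p \frac{\epsilon(p)}{p}$ (the criterion itself is taken up in sec. 7.4.2; the form used here is the standard one and is an assumption as far as the present passage goes). The sketch works in hypothetical units with $m = c = 1$ and scans a logarithmic grid of momenta.

```python
import math

MASS, SOUND = 1.0, 1.0   # hypothetical units with m = c = 1

def bogoliubov(p):
    """Dispersion (7-35a) rewritten in terms of c: p * sqrt(c^2 + (p/2m)^2)."""
    return p * math.sqrt(SOUND ** 2 + (p / (2.0 * MASS)) ** 2)

def free_particle(p):
    """Single-particle excitation of the ideal gas."""
    return p * p / (2.0 * MASS)

def critical_velocity(dispersion, p_grid):
    """Landau criterion: the smallest value of epsilon(p)/p on the grid."""
    return min(dispersion(p) / p for p in p_grid)

# Momenta from 1e-6 up to about 1e3 (in units of m c)
p_grid = [10.0 ** (-6.0 + 0.01 * j) for j in range(900)]
v_c_interacting = critical_velocity(bogoliubov, p_grid)
v_c_ideal = critical_velocity(free_particle, p_grid)
```

For the Bogoliubov spectrum the minimum is attained as $p \to 0$ and equals the sound velocity, whereas for the free-particle spectrum $\frac{\epsilon(p)}{p} = \frac{p}{2m}$ can be made arbitrarily small, so the critical velocity is limited only by the lowest momentum on the grid.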


The effective Hamiltonian (7-33a) is in the diagonal form, with eigenvalues

$$E_{\{n\}} = E_0 + \sum_{\mathbf p}{}'\,n_{\mathbf p}\,\epsilon(\mathbf p), \tag{7-37}$$

where $\{n\}$ stands for a set of occupation numbers $n_{\mathbf p}$,
specified for all $\mathbf p$ of the form
$\mathbf p = \frac{2\pi\hbar}{L}\sum_{i=1}^3 n_i\hat e_i$
($n_i = 0, \pm1, \pm2, \cdots$ for $i = 1, 2, 3$), representing possible
momentum vectors corresponding to single-particle states, satisfying periodic
boundary conditions, for free particles within a cube of edge length
$L (= V^{1/3})$, $\hat e_i$ ($i = 1, 2, 3$) being unit vectors parallel to the
edges of the cube. As mentioned above, an eigenvalue of the above form may be
interpreted as being the energy of an assembly of quasi-particles, where,
for each allowed momentum $\mathbf p$, there occur
$n_{\mathbf p} (= 0, 1, 2, \cdots)$ quasi-particles in the assembly, having
momentum $\mathbf p$ and energy $\epsilon(\mathbf p)$.

Evidently, to the approximation under consideration, the quasi-particles
are non-interacting, since all the possible momenta contribute to the
sum (7-37) independently of one another. Recall that the expression (7-33a)
is obtained under the assumption of a weak two-body interaction that can
be described completely in terms of the s-wave scattering length $a$, the
approximation under consideration being characterized by the small parameter
$na^3$ ($n = \frac NV$). Moreover, the theory assumes the temperature to be
sufficiently low such that $n_{\mathbf p} \approx 0$ for all but the lowest
momentum values. More specifically, states with momenta up to the value
corresponding to $\frac{pa}{\hbar} \sim \sqrt{na^3}$ are assumed to be relevant
([113], chapter 4).
7.3.3.1 Weakly interacting Bose gas: thermodynamic functions


Making use of the formula (7-33a) one can treat the weakly interacting
Bose gas (consisting of $N$ particles in a volume $V$) as an ideal gas of
quasi-particles with ground state energy $E_0$ given by (7-33b). The number
of quasi-particles, however, is not fixed (analogous to the photons
describing the state of an electromagnetic field) since a thermodynamic state
is described as a probability distribution over microscopic states
corresponding to all possible sequences $\{n\}$. One can now work out the
thermodynamic parameters characterizing an equilibrium state of the weakly
interacting Bose gas (recall that we have already obtained the pressure in
the leading approximation in (7-26)).

For instance, using $v = \frac VN$ in (7-34), the chemical potential per
particle $\mu$ of the weakly interacting Bose gas at $T \approx 0$ is
obtained as

$$\mu = \frac{\partial F}{\partial N} = \frac{\partial E_0}{\partial N} = \frac{4\pi a\hbar^2}{mv}\Big[1 + \frac{32}{3\sqrt\pi}\sqrt{na^3}\,\Big]. \tag{7-38}$$

The formula $\mu = \frac{\partial F}{\partial N}$ holds in the limit $N \to \infty$.
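The step from (7-34) to (7-38) is a differentiation with respect to $N$ at fixed $V$, and it can be checked by a finite difference. The sketch below uses hypothetical reduced units ($\hbar = m = 1$) and illustrative values of $a$, $N$ and $V$; the function names are not from the text.

```python
import math

HBAR, MASS, A_SCATT = 1.0, 1.0, 0.01   # hypothetical reduced units

def ground_state_energy(n_particles, volume):
    """E_0 of (7-34), written with v = V/N."""
    density = n_particles / volume
    leading = 2.0 * math.pi * A_SCATT * HBAR ** 2 * n_particles * density / MASS
    correction = 128.0 / (15.0 * math.sqrt(math.pi)) * math.sqrt(density * A_SCATT ** 3)
    return leading * (1.0 + correction)

def chemical_potential_formula(density):
    """Right-hand side of (7-38)."""
    leading = 4.0 * math.pi * A_SCATT * HBAR ** 2 * density / MASS
    return leading * (1.0 + 32.0 / (3.0 * math.sqrt(math.pi)) * math.sqrt(density * A_SCATT ** 3))

def chemical_potential_numeric(n_particles, volume, step=10.0):
    """Central-difference approximation to dE_0/dN at fixed V."""
    up = ground_state_energy(n_particles + step, volume)
    down = ground_state_energy(n_particles - step, volume)
    return (up - down) / (2.0 * step)

mu_numeric = chemical_potential_numeric(1.0e6, 1.0e6)
mu_formula = chemical_potential_formula(1.0)
```

The factor $\frac{32}{3\sqrt\pi}$ arises because the correction term in $E_0$ scales as $N^{5/2}$ at fixed $V$, so that differentiation multiplies its coefficient by $\frac54$ relative to the leading $N^2$ term.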


One observes that the chemical potential is small but positive (as
$T \to 0$), in contrast to the ideal Bose gas (of, say, $N$ particles in
volume $V$) where the chemical potential is negative (less than the
single-particle ground state energy in magnitude) and tends to zero as
$T \to 0$ (refer back to section 3.3.7.2).

The free energy of the weakly interacting Bose gas for given values of
$v = \frac VN$, $T$ is obtained as

$$F = U - TS = E_0 + U' - TS, \tag{7-39}$$

where $U'$ and $S$ stand for the energy and entropy of the ideal Bose gas
constituted of the elementary excitations. This gas has zero chemical
potential (since the number of elementary excitations is not fixed) and the
mean number of excitations in a single-quasiparticle state of momentum
$\mathbf p$ is given by

$$\bar n_{\mathbf p} = \frac{1}{e^{\beta\epsilon(p)} - 1}. \tag{7-40}$$

One then obtains

$$U' = \sum_{\mathbf p}\frac{\epsilon(p)}{e^{\beta\epsilon(p)} - 1}, \tag{7-41a}$$

while the entropy $S$ (which is also the entropy of the weakly interacting
Bose gas we started with) is obtained as

$$S = -k_B\sum_{\mathbf p}\ln\big(1 - e^{-\beta\epsilon(p)}\big) + \frac1T\sum_{\mathbf p}\frac{\epsilon(p)}{e^{\beta\epsilon(p)} - 1}. \tag{7-41b}$$

The result (7-41b) is obtained by making use of formulae (3-44) and (3-48b)
in which we put the chemical potential equal to zero, and then invoking the
formula $S = -\frac{\partial\Omega}{\partial T} = \frac{\partial}{\partial T}\big(\beta^{-1}\ln\mathcal Z\big)$,
where $\mathcal Z$ stands for the grand partition function and
$\Omega = -\beta^{-1}\ln\mathcal Z$. A sum over index a now stands for a sum
over $\mathbf p$.

In the limit $V \to \infty$, one replaces
$\sum_{\mathbf p} \to \frac{V}{(2\pi\hbar)^3}\int d^3p$, thereby obtaining the
results

$$U' = \frac{V}{(2\pi\hbar)^3}\int d^3p\,\frac{\epsilon(p)}{e^{\beta\epsilon(p)} - 1}, \tag{7-42a}$$

$$S = \frac{V}{(2\pi\hbar)^3}\int d^3p\,\Big[-k_B\ln\big(1 - e^{-\beta\epsilon(p)}\big) + \frac1T\,\frac{\epsilon(p)}{e^{\beta\epsilon(p)} - 1}\Big], \tag{7-42b}$$

and, finally, for the free energy of the weakly interacting Bose gas,

$$F = E_0 + \frac{k_BTV}{(2\pi\hbar)^3}\int d^3p\,\ln\big(1 - e^{-\beta\epsilon(p)}\big) \tag{7-42c}$$


(check these results out). At sufficiently low $T$, only small values of
momenta are relevant in working out the integrals, in which case one
can replace $\epsilon(p)$ with $cp$, corresponding to the phonon part of the
energy spectrum (recall that, at higher momenta, the energy spectrum
resembles that of a free particle). Making use of the results
$\int_0^\infty\frac{x^3}{e^x - 1}\,dx = \frac{\pi^4}{15}$ and
$\int_0^\infty x^2\ln(1 - e^{-x})\,dx = -\frac{\pi^4}{45}$, one obtains the
following expressions for the internal energy and the free energy of the
weakly interacting Bose gas

$$U = E_0 + U' = E_0 + \frac{\pi^2k_B^4T^4V}{30\hbar^3c^3}, \qquad F = E_0 - \frac{\pi^2k_B^4T^4V}{90\hbar^3c^3}, \tag{7-43}$$

where $E_0$ is given by (7-34) (check this statement out).

The specific heat at constant volume is obtained by differentiation
from the expression for $U$, and is given by

$$C_V = \frac{\partial U}{\partial T} = \frac{2\pi^2k_B^4V}{15\hbar^3c^3}\,T^3, \tag{7-44}$$

expressing the $T^3$ law that applies, in addition to the present context, to
the specific heat of a crystalline solid and to black body radiation.
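The two definite integrals quoted above, and the resulting $T^4$ and $T^3$ laws, can be checked with a simple quadrature. The sketch below works in hypothetical units with $k_B = \hbar = c = 1$ and $V = 1$, uses the linear spectrum $\epsilon = cp$ throughout, and measures energies relative to $E_0$; the function names are illustrative.

```python
import math

def integrate(f, upper=60.0, steps=200000):
    """Midpoint rule on (0, upper); the integrands below are negligible beyond x = 60."""
    h = upper / steps
    return h * sum(f((j + 0.5) * h) for j in range(steps))

energy_integral = integrate(lambda x: x ** 3 / math.expm1(x))
free_energy_integral = integrate(lambda x: x * x * math.log(-math.expm1(-x)))

def phonon_energy(temperature, volume=1.0):
    """U' = V T^4 I_3 / (2 pi^2) in units k_B = hbar = c = 1."""
    return volume * temperature ** 4 * energy_integral / (2.0 * math.pi ** 2)

def phonon_free_energy(temperature, volume=1.0):
    """F - E_0 = V T^4 I_log / (2 pi^2) in the same units."""
    return volume * temperature ** 4 * free_energy_integral / (2.0 * math.pi ** 2)

def specific_heat_numeric(temperature, rel_step=1.0e-4):
    """C_V = dU'/dT by a central difference."""
    d = rel_step * temperature
    return (phonon_energy(temperature + d) - phonon_energy(temperature - d)) / (2.0 * d)

c_v = specific_heat_numeric(0.5)
```

In these units (7-43) reads $U' = \frac{\pi^2T^4}{30}$ and $F - E_0 = -\frac{\pi^2T^4}{90}$, while (7-44) reads $C_V = \frac{2\pi^2T^3}{15}$; the numerical values agree with these to within the accuracy of the quadrature and of the finite difference.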


7.3.3.2 Weakly interacting Bose gas: BE condensation 


In working out the energy spectrum of the weakly interacting Bose gas by 
invoking the Bogoliubov approximation scheme, we have assumed that 
the ground state is populated by a macroscopic number of particles, i.e., 
the gas is below the transition temperature for BE condensation. It is of 
interest to briefly compare a number of features of the weakly interacting 


Bose gas with the ideal Bose gas (refer to section 3.3.7.2). 


As we have already seen, the weak interaction between the constituents 
of the gas causes a shift in the ground state energy of the system (for- 
mula (7-34)), implying a non-zero chemical potential (formula (7-38)) and 
a finite compressibility (refer to (7-26), which gives the equation of state 


in the leading approximation). 


The BE condensation temperature (7)) itself gets shifted due to the weak 


interaction ([75]) by 


6Ty = —CTpan3, (7-45) 


where C(® 1.29) is a constant. One notices again the appearance of the 


perturbation parameter ane resulting from the interaction. 


Finally, the weak interaction results in a change in the extent of condensate depletion ([113]), i.e., the fraction of particles excited away from the single-particle ground state. At $T = 0$ the condensate depletion has a non-zero value given by

$$\frac{N - N_0}{N} = \frac{8}{3\sqrt{\pi}}\,(na^3)^{1/2}, \qquad (7\text{-}46)$$

which differs from the case of the ideal gas, for which the condensate depletion is zero at $T = 0$. In the above expression, $N$, $N_0$ stand respectively for the total number of particles and the number in the single-particle ground state.
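To get a feeling for the size of the zero-temperature depletion, one can evaluate the standard Bogoliubov result $(N-N_0)/N = \frac{8}{3\sqrt{\pi}}(na^3)^{1/2}$ for a few values of the gas parameter $na^3$. This is a small sketch only; the function name and the sample values of $na^3$ are hypothetical, chosen for illustration.

```python
import math

def quantum_depletion(gas_parameter):
    """Fraction (N - N0)/N at T = 0 for a given value of n a^3."""
    return 8.0 / (3.0 * math.sqrt(math.pi)) * math.sqrt(gas_parameter)

# Sample values of n a^3 (illustrative, all in the dilute regime n a^3 << 1)
for na3 in (1e-6, 1e-4, 1e-2):
    print(na3, quantum_depletion(na3))
```

The depletion stays small as long as $na^3 \ll 1$, which is the regime where the perturbative treatment is meant to apply.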


At higher temperatures ($0 < k_BT \ll mc^2$), the fractional change in the condensate population is given by

$$\frac{N_0(T) - N_0(T=0)}{N_0(T=0)} = -\frac{m(k_BT)^2}{12\,nc\hbar^3}, \qquad (7\text{-}47)$$

for which the $T^2$ dependence differs from the $T^{3/2}$ dependence noted earlier (eq. (8-88)) in the case of the ideal gas.


7.4 Superfluidity 


7.4.1 Superfluidity: introduction 


Superfluidity was discovered through observations on liquid He-4 at the lowest attainable temperatures, where it was found to possess remarkable flow properties. Helium-4 does not solidify under normal pressures, in virtue of the very weak attractive interaction between its atoms and, additionally, of the small mass of the helium atoms that causes a relatively large zero-point energy. It undergoes a phase transition at around 2.18 K, across which its shear viscosity drops to zero and its specific heat shows a logarithmic singularity (referred to as a ‘lambda transition’), as in fig. 7-3. At temperatures close to absolute zero, the specific heat has a $T^3$-variation, as is expected of a non-interacting gas of bosons with zero chemical potential (the ideal phonon gas; refer to eq. (7-44) above).


Liquid He-4 below the transition temperature ($T_\lambda$) is referred to as He-II, to distinguish it from He-I, the liquid phase above $T_\lambda$.


[Figure 7-3: plot of specific heat against temperature, marking the $T^3$ variation and $T_\lambda$]

Figure 7-3: Depicting the variation of specific heat (measured along the vapor pressure curve) of liquid Helium 4 around the transition temperature $T_\lambda$ at which there occurs a logarithmic singularity; close to absolute zero, the specific heat varies as $T^3$, as in the case of an ideal phonon gas; based on [67], fig. 13.3.


Immediately after the discovery of superfluidity, Landau proposed an explanation of the phenomenon that has proved to be of lasting value. His theory starts from the consideration of elementary excitations in a liquid made up of bosons where these excitations involve collective modes at the lowest attainable temperatures, analogous to phonons in a crystalline solid. When the liquid flows past an obstacle or a boundary, such as the walls of a capillary, it tends to slow down due to the creation of such elementary excitations that transfer their energy, acquired from the mass motion of the liquid, to the body in contact.


Referring to the Galilean transformation properties of energy and momentum, Landau showed that, depending on the spectrum characterizing the elementary excitations, the creation of such an excitation may not be possible for sufficiently low values of the relative velocity between the liquid and the obstacle. Based on phenomenological considerations, Landau inferred the form of the excitation spectrum (i.e., the dependence of the energy, $\epsilon$, on the momentum, $p$, of the excitations) that could explain the existence of a critical value of the relative velocity below which the energy of mass motion of the liquid flowing through a tube cannot be transferred to the wall of the tube by means of the elementary excitations. He showed that the likely form of the spectrum could explain not only the existence of a critical velocity below which the viscous drag vanishes and unusual flow properties appear but, at the same time, a number of thermodynamic properties below the transition temperature $T_\lambda$.


In sec. 7.4.2 below, we briefly outline Landau’s explanation of the origin of the critical velocity and, at the same time, indicate how a particular form of the $\epsilon$-$p$ spectral relation characterizing the elementary excitations can imply the existence of a non-zero critical velocity less than the velocity of sound, as found experimentally in the case of liquid helium. We then introduce the two-fluid model of superfluidity and indicate, in sec. 7.4.3, how a number of thermodynamic properties of a superfluid can be explained on the basis of the assumed spectral form and the two-fluid model. This will be followed by brief (and sketchy) comments on the relation between BE condensation and superfluidity (sec. 7.4.4), and on quantum vortices (sec. 7.4.5).


7.4.2 Superfluidity: excitation spectrum and the critical velocity


A liquid qualifies as a superfluid if it flows through a capillary without any viscous resistance, such a flow being possible only if the relative velocity between the liquid and the capillary is less than a certain critical velocity $v_c$ which, in turn, requires that a set of appropriate conditions be fulfilled, as we see below. The mechanism of slowing down of the liquid can be understood by referring to the production and decay of excitations in it. Compared to its ground state, the low lying excited states of the liquid can be accounted for in terms of elementary excitations resulting from collective motions, the lowest excitations being sound waves of velocity $c$ with various wave vectors $\mathbf{k}$ and frequencies $\omega = c|\mathbf{k}|$. Such an elementary excitation can be looked upon as a quasi-particle (a ‘phonon’) with energy $\epsilon = \hbar\omega$ and momentum $\mathbf{p} = \hbar\mathbf{k}$. The dispersion formula $\epsilon(p) = cp$ gets modified for the higher excitations, and an $\epsilon$-$p$ relation of a certain type over an appropriately broad range of frequencies was identified by Landau as being of crucial relevance for the existence of a non-zero critical velocity which, in the case of liquid He-4, was found to be less than the velocity of sound.


The possibility of the existence of a critical velocity can be inferred from the Galilean transformation formulae for energy and momentum of a body between any two inertial frames of reference. Considering frames of reference K, K’ with a relative velocity $\mathbf{V}$ of the latter with reference to the former, the transformation formulae read

$$E' = E - \mathbf{P}\cdot\mathbf{V} + \frac{1}{2}MV^2, \qquad \mathbf{P}' = \mathbf{P} - M\mathbf{V}, \qquad (7\text{-}48)$$

($M$ is the mass of the body in either frame; $E$, $\mathbf{P}$ are the energy and momentum in K; $E'$, $\mathbf{P}'$ are the energy and momentum in K’; check these relations out).
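The Galilean transformation formulae for energy and momentum, $E' = E - \mathbf{P}\cdot\mathbf{V} + \frac{1}{2}MV^2$ and $\mathbf{P}' = \mathbf{P} - M\mathbf{V}$, can be checked on a toy system. The sketch below assumes a handful of point masses moving along a single axis, with an interaction energy that depends only on relative positions and is therefore the same in both frames; all numerical values and names are hypothetical.

```python
import random

def energy_momentum(masses, velocities, potential=0.0):
    """Total energy and momentum of point masses moving along one axis."""
    energy = potential + sum(0.5 * m * u * u for m, u in zip(masses, velocities))
    momentum = sum(m * u for m, u in zip(masses, velocities))
    return energy, momentum

random.seed(0)
masses = [random.uniform(1.0, 2.0) for _ in range(5)]
velocities = [random.uniform(-1.0, 1.0) for _ in range(5)]
V = 0.7       # velocity of frame K' relative to K (hypothetical value)
U_int = 0.3   # interaction energy, assumed frame independent

E, P = energy_momentum(masses, velocities, U_int)
M = sum(masses)

# In K' every particle velocity is reduced by V
E_prime, P_prime = energy_momentum(masses, [u - V for u in velocities], U_int)

print(E_prime, E - P * V + 0.5 * M * V**2)
print(P_prime, P - M * V)
```

Each printed pair agrees to rounding error, since the kinetic energy $\frac{1}{2}m(u-V)^2$ of every particle expands into exactly the three terms of the transformation formula.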


As an application of these transformation formulae, let a mass $m$ of liquid flow through a capillary with velocity $\mathbf{v}$ relative to the latter (refer to fig. 7-4), and let K be the frame moving with the liquid. If now a quasi-particle of momentum $\mathbf{p}$ and energy $\epsilon(p)$ is created in the liquid, then the energy and momentum of the moving liquid, as seen in the frame K’ attached to the capillary, will be

$$E' = \epsilon(p) + \mathbf{p}\cdot\mathbf{v} + \frac{1}{2}mv^2, \qquad \mathbf{P}' = \mathbf{p} + m\mathbf{v}, \qquad (7\text{-}49a)$$

where the energy of the liquid in its ground state is assumed to be zero for the sake of simplicity. Comparing these with the energy and momentum (as seen from the capillary frame) in the absence of the excitation, one finds that the creation of the excitation as a spontaneous process is possible only if

$$\epsilon(p) + \mathbf{p}\cdot\mathbf{v} < 0, \qquad (7\text{-}49b)$$

because only then can the excess energy be dissipated to surrounding systems (the wall of the capillary in the present case).


Figure 7-4: Depicting the flow of liquid of mass $m$ with velocity $\mathbf{v}$ through a capillary tube; K, K’ are inertial frames attached to the liquid mass and the capillary; an elementary excitation of momentum $\mathbf{p}$ and energy $\epsilon(p)$ is generated spontaneously in the liquid if the condition (7-49b) is satisfied; $\mathbf{p}$ is shown to be in a direction opposite to that of $\mathbf{v}$ since, if the condition is not satisfied in this case then it will not be satisfied for any other direction of $\mathbf{p}$ for given values of $p$ and $v$.


For any specified values of $|\mathbf{p}| = p$, $|\mathbf{v}| = v$, the minimum value of the left hand side of (7-49b) is $\epsilon(p) - pv$, corresponding to the situation where the vectors $\mathbf{p}$ and $\mathbf{v}$ are directed oppositely to each other. Thus if, for any given $v$, $\epsilon(p) > pv$ for all values of $p$, then elementary excitations cannot be produced spontaneously in the moving liquid, which consequently cannot be slowed down during its flow through the capillary. In other words, the production (and subsequent decay) of elementary excitations will not be possible if the velocity of flow of the liquid relative to the capillary is less than $\min_p \frac{\epsilon(p)}{p}$ which, therefore, can be identified as the critical velocity $v_c$. Evidently, $\min_p \frac{\epsilon(p)}{p}$ $(= v_c)$ has to have a positive value for a liquid to qualify as a superfluid. In addition, observations on liquid He-4 below the transition temperature $T_\lambda$ reveal that the critical velocity is less than the velocity of sound in it, which is the velocity of propagation of excitations of the lowest possible momenta ($p \to 0$).


The requirement that $\min_p \frac{\epsilon(p)}{p}$ is to have a non-zero positive value ($v_c$) less than the velocity of sound is satisfied if the graph depicting the variation of the energy $\epsilon$ of a quasi-particle as a function of its momentum $p$ looks like the one in fig. 7-5, which gives the excitation spectrum of liquid He-4 at $T \approx 0$. In this figure, the slope of the tangent at the origin, giving the velocity of sound ($c$), is larger than the slope at the point P, the tangent at which passes through the origin. It is the slope at P that gives the critical velocity $v_c$ since one has, at P, $\frac{d\epsilon}{dp} = \frac{\epsilon}{p}$, which is precisely the condition for $\frac{\epsilon}{p}$ to be minimum (a global minimum for the graph shown in the figure).


[Figure 7-5: plot of $\epsilon(p)$ against $p$]

Figure 7-5: Depicting the nature of the $\epsilon$-$p$ graph that implies the existence of a positive value of the critical velocity $v_c = \min_p \frac{\epsilon(p)}{p}$; the critical velocity corresponds to the point P at which the tangent to the graph passes through the origin (reason this out), and is given by the slope of the tangent at this point; the critical velocity, moreover, is less than the velocity of sound ($c$) which corresponds to the slope of the graph at the origin; because of the isotropy of the liquid, the $\epsilon$-$p$ graph looks the same for all directions of $\mathbf{p}$; the energy gap $\Delta$ is shown (refer to (7-50a)); the graph depicts schematically the excitation spectrum of liquid He-4 at $T \approx 0$.


As mentioned above, the qualitative nature of the $\epsilon$-$p$ graph necessary to explain various features of superfluidity was inferred phenomenologically by Landau, whose theory was found to be a fundamentally sound one by subsequent developments in the field. One observes that the graph in fig. 7-5 possesses a minimum, of value $\epsilon_{\min} = \Delta$, at a non-zero positive value of $p = p_0$ around which one can write

$$\epsilon(p) = \Delta + \frac{(p - p_0)^2}{2\mu}, \qquad (7\text{-}50a)$$

with $\mu > 0$, while at low values of $p$ ($\approx 0$) the variation is linear:

$$\epsilon(p) \approx cp, \qquad (7\text{-}50b)$$

corresponding to elementary excitations analogous to acoustic phonons in solids (in a crystalline solid, however, there generally occur three different branches representing dispersion curves for long wavelength acoustic phonons).


Evidently, the $\epsilon$-$p$ dispersion relation is qualitatively different in the two regions $p \approx 0$ and $p \approx p_0$ since, in the former, the dispersion relation is linear while in the latter, it is quadratic, with a minimum at $p = p_0$, $\epsilon = \Delta$. The elementary excitations in these two regions are referred to, respectively, as phonons and rotons, the latter being characterized by a non-zero energy gap $\Delta$. It is this non-zero energy gap that tells us that the excitations acquire a distinctive character at relatively large values of the momentum.
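The Landau criterion $v_c = \min_p \epsilon(p)/p$ can be evaluated explicitly for the roton part of the spectrum, $\epsilon(p) = \Delta + (p-p_0)^2/(2\mu)$. Setting $d\epsilon/dp = \epsilon/p$ gives $p_*^2 = p_0^2 + 2\mu\Delta$ and $v_c = (p_* - p_0)/\mu$. The sketch below compares this with a brute-force scan. The roton parameters and the sound velocity are illustrative values of the magnitude commonly quoted for He-4; they are assumptions of this example, not numbers taken from the text.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
M_HE4 = 6.6464731e-27    # kg

# Illustrative roton parameters for He-4 (assumed values)
DELTA = 8.65 * K_B       # energy gap
P0 = 1.92e10 * HBAR      # momentum at the roton minimum
MU = 0.16 * M_HE4        # roton effective mass
C_SOUND = 238.0          # m/s, slope of the phonon branch (assumed)

def roton_energy(p):
    return DELTA + (p - P0) ** 2 / (2.0 * MU)

# Tangent-through-origin condition: p*^2 = p0^2 + 2 mu Delta
p_star = math.sqrt(P0**2 + 2.0 * MU * DELTA)
v_c = (p_star - P0) / MU

# Brute-force check: scan eps(p)/p on a grid from 0.5 p0 to 1.5 p0
grid = [P0 * (0.5 + i / 20000) for i in range(20001)]
v_scan = min(roton_energy(p) / p for p in grid)

print(v_c, v_scan, DELTA / P0, C_SOUND)
```

On the phonon branch $\epsilon/p$ equals $c$, so with these parameters the global minimum comes from the roton region and lies well below the sound velocity; the cruder estimate $\Delta/p_0$ is printed for comparison.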


The $\epsilon$-$p$ dispersion curve for liquid He-4 can be obtained experimentally by means of inelastic slow neutron scattering and does indeed look like fig. 7-5, but the critical velocity obtained from the experimental curve is found to be larger than the actual critical velocity of He-4 by two orders of magnitude. This discrepancy is caused by the production and decay of another mode of excitation in He-4, namely, quantum vortices.


1. In an inelastic slow neutron scattering experiment a beam of slow neutrons is made to interact with the liquid He-4 sample and the energy and intensity of the scattered neutrons are measured in numerous different directions. If a neutron produces an elementary excitation of momentum $\mathbf{p}$ and energy $\epsilon(p)$, then this will correspond to a peak in a certain definite direction and for a certain definite energy of the scattered neutron, determined by the conservation principles. In other words, the excitation spectrum determines the differential scattering cross-section for various possible scattered neutron energies. The differential scattering cross-section, in turn, is related directly to the dynamic structure factor $S(\mathbf{k},\omega)$, which is the space- and time-Fourier transform of the density-density correlation function of the liquid ([114], chapter 12; [113], chapters 7, 8; [133]; see also section 8.4.10 of the present book). Slow neutrons are necessary to produce low-lying excitations of the liquid, held at a temperature below the $\lambda$-transition.


2. There exists a large body of literature on the experimental and theoret- 
ical determination of the excitation spectrum of liquid He II, the latter 
in terms of the dynamic structure factor. An important pioneering 
contribution was from Feynman ([113], chapter 8) who worked out an 
expression for the excitation spectrum in terms of the static structure 


factor S(k) 


S(k) = af. S(k, w)dw. (7-51) 


While Feynman’s formula gives an accurate description for the excitation spectrum of a weakly interacting Bose gas, it applies to liquid Helium II only at the lowest energies (the phonon part of the spectrum) and deviates more and more as one moves to higher energies, overestimating the excitation energy $\epsilon(p)$ by a factor of two near the roton part.


Referring in this context to the $\epsilon$-$p$ spectrum for a weakly interacting Bose gas (formula (7-35a), depicted graphically in fig. 7-2), one observes that the global minimum of $\frac{\epsilon(p)}{p}$ as a function of $p$ occurs at $p \to 0$, which implies that the critical velocity in the case of the Bose gas is no different from the velocity of sound. Moreover, the ideal Bose gas cannot be a superfluid since all excitations are particle-like, with a dispersion relation $\epsilon(p) = \frac{p^2}{2m}$, for which $\frac{\epsilon(p)}{p} = \frac{p}{2m}$ goes to zero as $p \to 0$ so that the critical velocity vanishes. The relationship between the phenomena of BE condensation and superfluidity is intimate but complex in nature.
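The contrast between the two cases can be made concrete with a few numbers. The sketch below assumes the usual Bogoliubov form $\epsilon(p) = \sqrt{(cp)^2 + (p^2/2m)^2}$ for the weakly interacting gas (this explicit form is an assumption of the example, standing in for formula (7-35a)) and works in units with $c = m = 1$.

```python
import math

def bogoliubov_energy(p, c=1.0, m=1.0):
    """Assumed Bogoliubov spectrum sqrt((c p)^2 + (p^2 / 2m)^2)."""
    return math.sqrt((c * p) ** 2 + (p * p / (2.0 * m)) ** 2)

def free_particle_energy(p, m=1.0):
    """Ideal gas spectrum p^2 / 2m."""
    return p * p / (2.0 * m)

# Momenta from p -> 0 up to p >> mc (units with c = m = 1)
momenta = [10.0 ** k for k in range(-6, 2)]
for p in momenta:
    print(p, bogoliubov_energy(p) / p, free_particle_energy(p) / p)

# Smallest value of eps(p)/p found on the sampled momenta
landau_bogoliubov = min(bogoliubov_energy(p) / p for p in momenta)
landau_ideal = min(free_particle_energy(p) / p for p in momenta)
print(landau_bogoliubov, landau_ideal)
```

For the Bogoliubov spectrum $\epsilon/p = \sqrt{c^2 + p^2/4m^2}$ increases with $p$, so the smallest value is the sound velocity, reached as $p \to 0$; for the free-particle spectrum the ratio can be made as small as one likes.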


7.4.3 Superfluidity: the two-fluid model 


When liquid He-4 flows through a tube, the thermally generated excitations in it are carried by a part of the flowing mass that suffers viscous resistance as the excitations give up their energy to the wall of the tube, while the rest, with no excitations in it, flows as a separate entity without resistance. In other words, at any given temperature $T$ ($0 < T < T_\lambda$), He II can be looked upon as being made up of two components: the superfluid component and the normal component, with distinct sets of thermodynamic parameters.

At $T = 0$ the superfluid constitutes the entire mass of He-II since there are no thermally generated excitations. At a slightly higher temperature, one can have an equilibrium situation when the two components are not in relative motion, while the two appear as distinct components when in relative motion with velocities, say $\mathbf{v}_s$, $\mathbf{v}_n$ of the superfluid and normal components relative to the wall of the tube through which we assume He-II to be flowing. In the absence of an external driving force, the relative velocity between the normal component and the wall eventually becomes zero, while the velocity of the superfluid component relative to the normal component ($\mathbf{v}_s - \mathbf{v}_n$) continues to remain at some non-zero value. A configuration with such a relative motion between the two components appears as one of equilibrium, since it can persist for an inordinately long time before the relative motion dies down (the two components interact among themselves only at a higher degree of approximation).


Strictly speaking, however, this corresponds to a quasi-equilibrium configuration where the motion of the superfluid relative to the normal component is to be treated as an out-of-equilibrium evolution that can persist, in the first approximation, for an indefinitely long time like the flow of an ideal fluid (i.e., one of zero viscosity; refer to section 8.3.7.1 where the ideal fluid approximation of hydrodynamic evolution is outlined). One can apply the principles of equilibrium statistical mechanics to the normal component to describe and analyze several features of He-II, treating the relative velocity $\mathbf{w} = \mathbf{v}_s - \mathbf{v}_n$ as an additional thermodynamic parameter, and assuming $|\mathbf{w}|$ to be small so as to retain only linear terms in $\mathbf{w}$ in the description (the relative velocity $\mathbf{w}$ acts as a constraint in determining the equilibrium properties of the normal component), in which case the principles of linear evolution in non-equilibrium statistical mechanics can be invoked in respect of the superfluid component. The normal component may be assumed, for the sake of simplicity, to continue to be at rest relative to the flow tube.


The principles of non-equilibrium statistical mechanics describing the time-evolution of systems close to equilibrium are enunciated in chapter 8. In the present context, the relative motion between the superfluid and the normal components of He II can be described as an ideal flow that appears as the first approximation in the statistical mechanics describing hydrodynamic processes (refer to sec. 8.3.7.1) in the so-called linear response regime, i.e., one where a linear relation can be assumed to hold between the fluxes and thermodynamic driving forces. In the following, we will not be concerned with the hydrodynamic description, and will only look at the description of the equilibrium configuration of the normal component of He-II for a vanishingly small relative velocity $\mathbf{w}$. The hydrodynamics of the superfluid component, while of overriding relevance in explaining the properties of liquid He-4, will not be taken up in this book (for an outline, see [117], chapter 10) except for a brief mention of the vortex excitations.


The mass density $\rho$ of He II can be expressed as the sum of two terms corresponding to the mass densities of the two components

$$\rho = \rho_s + \rho_n, \qquad (7\text{-}52a)$$

while the momentum of the fluid per unit volume ($\mathbf{j}$) is

$$\mathbf{j} = \rho_s\mathbf{v}_s + \rho_n\mathbf{v}_n. \qquad (7\text{-}52b)$$

The momentum carried by the normal component (i.e., the gas of excitations) per unit volume is also given, in the thermodynamic limit, by

$$\mathbf{j}_n = \frac{1}{h^3}\int d^3p\;\mathbf{p}\,n(\mathbf{p}), \qquad (7\text{-}53)$$

where $n(\mathbf{p})$ is the mean number of excitations of momentum $\mathbf{p}$ in the frame of the normal component.


In order to proceed further, recall that the energy spectrum $\epsilon(p)$ refers to excitations produced in the rest frame of the superfluid. As we saw, for a sufficiently small value of the superfluid velocity, new excitations cannot be produced, because such a process is not energetically favorable. At any finite temperature ($T < T_\lambda$; we assume that $T$ is sufficiently away from $T_\lambda$) the excitations produced thermally can be described as a gas of non-interacting phonons and rotons, forming the normal component of He-II. Considered in the frame of the normal component which, at thermodynamic equilibrium, is at rest with respect to the capillary through which we assume the superfluid to be moving, the energy of an excitation of momentum $\mathbf{p}$ is $\epsilon(p) + \mathbf{p}\cdot(\mathbf{v}_s - \mathbf{v}_n)$ (reason this out; $\epsilon(p)$ and $\mathbf{p}$ refer to the superfluid rest frame). As in the case of a weakly interacting Bose gas, the excitations behave as bosons (recall from sec. 7.3.3 that the creation and annihilation operators of the excitations satisfy boson commutation relations). Moreover, the chemical potential of the gas of excitations is zero since these are quasi-particles whose number is not fixed (in contrast to Helium atoms whose number can be specified). Consequently, the mean number of excitations of momentum $\mathbf{p}$ and energy $\epsilon(p)$ in the gas of excitations in thermal equilibrium (i.e., in the normal fluid component) is given by (refer to formula (3-49))

$$n(\mathbf{p}) = \frac{1}{e^{\beta[\epsilon(p) + \mathbf{p}\cdot(\mathbf{v}_s - \mathbf{v}_n)]} - 1}. \qquad (7\text{-}54)$$


With this expression for $n(\mathbf{p})$, one obtains (refer to (7-52b), in which one takes $\mathbf{v}_s = 0$)

$$\rho_n\mathbf{v}_n = \frac{1}{h^3}\int d^3p\;\mathbf{p}\,n(\mathbf{p}). \qquad (7\text{-}55)$$


We now make use of (7-54), put $\mathbf{v}_s = 0$, and expand the resulting expression in powers of $\mathbf{v}_n$. The term of degree zero does not contribute to the integral while the term of degree one gives

$$\rho_n = \frac{4\pi\beta}{3h^3}\int_0^\infty dp\,\frac{p^4 e^{\beta\epsilon(p)}}{\left(e^{\beta\epsilon(p)} - 1\right)^2}, \qquad (7\text{-}56)$$

(check this out; terms of higher degree are not relevant since thermodynamic equilibrium requires $\mathbf{v}_n \to 0$). In the low temperature limit ($T/T_\lambda$ less than and away from unity; close to $T_\lambda$, the elementary excitations interact with one another and the chemical potential of the gas of excitations attains a non-zero value as seen in sec. 7.3.3.1 in the case of a weakly interacting Bose gas), only the phonon part of the spectrum contributes to the integral in (7-56), and one obtains [113]

$$\rho_n = \frac{2\pi^2(k_BT)^4}{45\hbar^3c^5}, \qquad (7\text{-}57)$$

(check this out; the integral in (7-56) can be worked out by first integrating by parts).
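The passage from the integral expression for $\rho_n$ to the closed form $2\pi^2(k_BT)^4/(45\hbar^3c^5)$ can be checked numerically for the phonon spectrum $\epsilon(p) = cp$. With $x = \beta cp$ the integral reduces to $\int_0^\infty x^4 e^x/(e^x-1)^2\,dx = 4\pi^4/15$. In the sketch below the Simpson integrator, the cutoff, the sound velocity of 238 m/s and the temperature of 0.5 K are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34       # J s
H = 2.0 * math.pi * HBAR     # Planck constant
K_B = 1.380649e-23           # J / K

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    # x^4 e^x / (e^x - 1)^2 rewritten as x^4 / (4 sinh^2(x/2)); limit 0 at x = 0
    return x**4 / (4.0 * math.sinh(0.5 * x) ** 2) if x > 0 else 0.0

def rho_n_numeric(T, c):
    """Normal density from the integral expression with eps(p) = c p."""
    beta = 1.0 / (K_B * T)
    p_scale = K_B * T / c            # p = x * p_scale
    integral = simpson(integrand, 0.0, 60.0) * p_scale**5
    return 4.0 * math.pi * beta / (3.0 * H**3) * integral

def rho_n_closed(T, c):
    """Closed-form phonon result 2 pi^2 (k_B T)^4 / (45 hbar^3 c^5)."""
    return 2.0 * math.pi**2 * (K_B * T) ** 4 / (45.0 * HBAR**3 * c**5)

T, c = 0.5, 238.0   # K and m/s (illustrative values)
print(rho_n_numeric(T, c), rho_n_closed(T, c))
```

Note that the integral expression carries $h$ while the closed form carries $\hbar$; the factor $(2\pi)^3$ between $h^3$ and $\hbar^3$ is what turns $16\pi^5/45$ into $2\pi^2/45$.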


Knowing the value of $\rho_n$, one can work out $\rho_s$ from the measured value of the total density $\rho$ (refer to eq. (7-52a)). The above formula for $\rho_n$ remains valid for $T$ sufficiently away from $T_\lambda$ since, close to $T_\lambda$, the mutual interaction of the elementary excitations begins to acquire relevance. The observed variation of $\frac{\rho_n}{\rho}$ and $\frac{\rho_s}{\rho}$ with temperature is depicted schematically in fig. 7-6 where one finds that the former increases from 0 to 1 and the latter correspondingly decreases from 1 to 0 as $T$ increases from 0 to $T_\lambda$.


[Figure 7-6: plot of relative density against temperature]

Figure 7-6: Depicting the variation of relative densities of the normal and superfluid components ($\frac{\rho_n}{\rho}$ and $\frac{\rho_s}{\rho}$) of He-II as $T$ varies from 0 to $T_\lambda$ (schematic); at $T \approx 0$, the superfluid component dominates ($\rho_n$ goes to zero like $T^4$) while, for $T \to T_\lambda$, $\rho_s$ goes to zero (the sum of the two relative densities is unity).

The expression for $n(\mathbf{p})$ given by (7-54) can be made use of in calculating the internal energy, entropy, and other thermodynamic properties of the normal component of Helium II, treating it as an ideal gas of the elementary excitations. All these follow from the expression of the grand potential $\Omega$ (which reduces to the free energy $F$, since the chemical potential of the ideal Bose gas of excitations is zero)

$$\Omega\,(= F) = -\frac{k_BTV}{h^3}\int d^3p\,\ln\bigl(1 + n(\mathbf{p})\bigr), \qquad (7\text{-}58)$$

where $n(\mathbf{p})$ is given by (7-54) (check this out; refer to sec. 3.3.3.1, and consider single-particle states with well-defined momenta ($\mathbf{p}$)), and where the contribution of the ground state energy is excluded. As seen from (7-54) and (7-58), the thermodynamic functions depend on the relative velocity $\mathbf{w} = \mathbf{v}_s - \mathbf{v}_n$, and the relevant expressions can be expanded in ascending powers of $w$. In the following we state the expressions in the limit $w \to 0$, while corrections involving higher powers of $w$ can also be worked out [77].


In the limit $w \to 0$, the expressions for the thermodynamic functions can all be related to integrals of the form $\int_0^\infty f(\epsilon(p))\,p^r\,dp$, where $f$ stands for some function of $\epsilon(p)$ and $r$ for an exponent depending on the thermodynamic property in question. In the following $n(p)$ stands for $\frac{1}{e^{\beta\epsilon(p)} - 1}$. The integral in question can further be approximated as a sum of two contributions, one coming from the phonon part of the excitation spectrum and the other coming from the roton part. In other words, one can consider separately the contributions of the phonon gas ($\epsilon(p) = cp$) and the roton gas ($\epsilon(p)$ given by (7-50a)).

As regards the phonon contribution, the thermodynamic functions are all obtained from corresponding results for the weakly interacting Bose gas stated in sec. 7.3.3.1. Thus, the internal energy, free energy, and entropy of the phonon gas are given, in the low temperature approximation, by ([113], [77])

$$U^{[\mathrm{phon}]} = \frac{\pi^2 k_B^4 T^4 V}{30\hbar^3 c^3},$$

$$F^{[\mathrm{phon}]} = -\frac{\pi^2 k_B^4 T^4 V}{90\hbar^3 c^3} = -\frac{1}{3}U^{[\mathrm{phon}]},$$

$$S^{[\mathrm{phon}]} = -\frac{\partial F^{[\mathrm{phon}]}}{\partial T} = \frac{2\pi^2 k_B^4 T^3 V}{45\hbar^3 c^3}, \qquad (7\text{-}59)$$

all these expressions actually representing the thermodynamic functions of the normal component of Helium-II (recall that the superfluid component is free of excitations). The phonon contribution to the specific heat ($T\frac{\partial S}{\partial T}$) of the normal component has a $T^3$-variation, typical of a gas of non-interacting phonons (the same $T^3$ dependence is observed in the case of the specific heat of a crystalline solid at low temperatures).


For the roton gas, on the other hand, one can make use of the formula (7-50a) for $\epsilon(p)$ in (7-54), and also of the empirical fact that, for the temperature range under consideration ($T/T_\lambda$ less than and away from unity), the energy gap $\Delta$ is large compared to $k_BT$. This reduces the distribution of rotons into one of a Maxwell-Boltzmann type, and makes easier the evaluation of the thermodynamic parameters of the ideal roton gas. To start with, the mean number of rotons at temperature $T$ is found to be

$$N^{[\mathrm{rot}]} = \frac{V}{h^3}\int d^3p\,\exp\left[-\beta\left(\Delta + \frac{(p - p_0)^2}{2\mu}\right)\right] \approx \frac{2p_0^2\sqrt{\mu k_BT}\,V}{(2\pi)^{3/2}\hbar^3}\,e^{-\frac{\Delta}{k_BT}}. \qquad (7\text{-}60)$$

The contribution of the roton gas to the free energy, entropy, and the specific heat work out to ([113], [77])

$$F^{[\mathrm{rot}]} = -N^{[\mathrm{rot}]}k_BT,$$

$$S^{[\mathrm{rot}]} = N^{[\mathrm{rot}]}k_B\left(\frac{\Delta}{k_BT} + \frac{3}{2}\right),$$

$$C^{[\mathrm{rot}]} = N^{[\mathrm{rot}]}k_B\left(\frac{\Delta^2}{(k_BT)^2} + \frac{\Delta}{k_BT} + \frac{3}{4}\right). \qquad (7\text{-}61)$$

Finally, the roton contribution to the density of the normal component ($\rho_n$) is found to be

$$\rho_n^{[\mathrm{rot}]} = \frac{p_0^2}{3k_BT}\,\frac{N^{[\mathrm{rot}]}}{V}, \qquad (7\text{-}62)$$

(check the above results out).


The roton contribution to the thermodynamic properties of He II remains 
small compared to the phonon contribution at T ~ 0, but slowly gains in 
importance as the temperature is made to increase. The two contribu- 
tions become comparable at T ~ 0.8K, beyond which the roton contribu- 


tion quickly becomes predominant. 
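The statement about the crossover can be explored with a rough numerical estimate. The sketch below compares the phonon specific heat per unit volume, $2\pi^2k_B^4T^3/(15\hbar^3c^3)$, with the roton specific heat $n^{[\mathrm{rot}]}k_B\left(\frac{3}{4} + x + x^2\right)$, $x = \Delta/k_BT$, using the Maxwell-Boltzmann roton density $n^{[\mathrm{rot}]} = 2p_0^2\sqrt{\mu k_BT}\,e^{-x}/((2\pi)^{3/2}\hbar^3)$. The values of $c$, $\Delta$, $p_0$ and $\mu$ are illustrative numbers of the magnitude commonly quoted for He-4 and are assumptions of this example; the precise crossover temperature depends on them and on which thermodynamic quantity is compared.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
M_HE4 = 6.6464731e-27    # kg

# Illustrative He-4 parameters (assumed values)
C_SOUND = 238.0          # m/s
DELTA = 8.65 * K_B       # roton gap
P0 = 1.92e10 * HBAR      # roton momentum
MU = 0.16 * M_HE4        # roton effective mass

def c_phonon(T):
    """Phonon specific heat per unit volume (T^3 law)."""
    return 2.0 * math.pi**2 * K_B**4 * T**3 / (15.0 * HBAR**3 * C_SOUND**3)

def roton_density(T):
    """Roton number per unit volume in the Maxwell-Boltzmann approximation."""
    prefactor = 2.0 * P0**2 * math.sqrt(MU * K_B * T) / ((2.0 * math.pi) ** 1.5 * HBAR**3)
    return prefactor * math.exp(-DELTA / (K_B * T))

def c_roton(T):
    """Roton specific heat per unit volume."""
    x = DELTA / (K_B * T)
    return roton_density(T) * K_B * (0.75 + x + x * x)

def crossover(lo=0.3, hi=1.5, steps=60):
    """Bisection for c_roton(T) = c_phonon(T); assumes one crossing in [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if c_roton(mid) < c_phonon(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

T_cross = crossover()
print(T_cross, c_phonon(T_cross), c_roton(T_cross))
```

Because of the factor $e^{-\Delta/k_BT}$, the roton term is negligible well below the crossover and grows rapidly above it, in line with the qualitative statement in the text.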


Second sound. 


The two-fluid model of superfluid helium implies the existence of what is referred to as second sound. The propagation of sound waves in He-II, referred to as first sound, involves the sinusoidal variation of density, where the densities of both the normal and the superfluid components vary in phase. In contrast, it is possible for a wave to be set up in liquid helium II in which the densities of the two components vary out of phase. This differs from first sound in that the total mass density remains constant, but the entropy density varies sinusoidally. In other words, second sound is a mode of propagation of heat, not in accordance with the diffusion equation but in the form of a wave. While the velocity of first sound is given by the expression [113]

$$v_1^2 = \left(\frac{\partial P}{\partial \rho}\right)_S, \qquad (7\text{-}63a)$$

the second sound velocity is

$$v_2^2 = \frac{\rho_s T S^2}{\rho_n C_p}, \qquad (7\text{-}63b)$$

where $S$ and $C_p$ stand for the entropy and the specific heat at constant pressure (both pertaining to the same mass, say, $m$ of helium).


The existence of two modes of wave propagation involving density variations of the superfluid and normal components of He II, and the expressions for their velocities given above, follow from the balance equations of the total mass, the momenta of the two components, and the entropy, which is carried by only the normal component of He-II (the entropy of the superfluid component is zero).


As seen from formula (7-63b), the velocity of second sound is small just below $T_\lambda$ and increases to large values as $T$ goes to zero, when one has

$$v_2 = \frac{c}{\sqrt{3}}.$$
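The low temperature limit $v_2 \to c/\sqrt{3}$ can be illustrated with the phonon expressions alone. The sketch below assumes that $S$ and the specific heat in the second sound formula are taken per unit mass, that the specific heat is $3S$ (the $T^3$ law), and that $\rho_n = 2\pi^2(k_BT)^4/(45\hbar^3c^5)$; rotons are ignored, so it is meaningful only well below 1 K. The density of 145 kg/m³ and the sound velocity of 238 m/s are illustrative assumed values.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
C_SOUND = 238.0          # m/s (assumed)
RHO = 145.0              # kg / m^3, total mass density (assumed)

def second_sound_speed(T):
    """Second sound velocity from rho_s T S^2 / (rho_n C), phonons only."""
    # Phonon entropy per unit mass
    s = 2.0 * math.pi**2 * K_B**4 * T**3 / (45.0 * HBAR**3 * C_SOUND**3) / RHO
    heat_capacity = 3.0 * s          # T dS/dT for S proportional to T^3
    rho_n = 2.0 * math.pi**2 * (K_B * T) ** 4 / (45.0 * HBAR**3 * C_SOUND**5)
    rho_s = RHO - rho_n
    return math.sqrt(rho_s * T * s * s / (rho_n * heat_capacity))

for T in (0.05, 0.1, 0.2, 0.4):
    print(T, second_sound_speed(T) / C_SOUND, 1.0 / math.sqrt(3.0))
```

In this approximation $TS$ per unit volume equals $\rho_n c^2$, so the ratio reduces to $\sqrt{\rho_s/(3\rho)}$, which tends to $1/\sqrt{3}$ as $\rho_s \to \rho$.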
7.4.4 BE condensation and superfluidity


7.4.4.1 BE condensation and superfluidity: introduction


BE condensation and superfluidity are both observed as low temperature phenomena, and both are caused by the bosonic nature of the particles and excitations involved. The question, however, arises as to whether BE condensation has a causal role in bringing about the state of superfluidity. Landau’s phenomenological theory of superfluidity accords no such role to BE condensation, while London suggested a more direct link between the two phenomena, based on the observation that the superfluid transition in liquid helium occurs at a temperature close to (and less than) the BE condensation temperature for a hypothetical ideal gas made up of helium atoms.


Experimental and theoretical studies do indicate that the two phenomena are closely linked, though the exact nature of this link is still to be clarified. Thus, there is no simple relation between the superfluid fraction ($\frac{\rho_s}{\rho}$; refer to (7-52a)) and the condensate fraction (see below for an explanation of this term) at any specified temperature (below $T_\lambda$) in Helium-II. At this point we enter into a brief digression on the criterion for BE condensation in interacting systems and the BE order parameter.


7.4.4.2 Criterion for BE condensation in interacting systems


The theory of BE condensation in an ideal gas of bosons, outlined in brief in section 3.3.7.2, focuses on the occurrence of a macroscopic population of the single-particle ground state. In the case of a system of interacting particles, the single-particle states lose their relevance, though these are still seen to be meaningful in Bogoliubov’s perturbative treatment for a weakly interacting Bose gas (sec. 7.3.3). Penrose and Onsager [109] (see also [113]) generalized the criterion for BE condensation to include the case of interacting systems, while still retaining analogy to the ideal gas criterion, by referring to the single-particle reduced density operator $\hat{\rho}_1$. The latter is obtained from the N-particle density operator $\hat{\rho}_N$ (specifying a given equilibrium state of a system of $N$ interacting indistinguishable bosons) by taking a partial trace over states of $N - 1$ particles and multiplying with $N$.


If $\mathcal{V}$ denotes the state space of a single particle, in which a basis is made up of vectors $|i\rangle$ ($i = 1, 2, \cdots$), then the states of the system are symmetrized vectors in the N-fold direct product space of $N$ copies of $\mathcal{V}$, in which a basis is made up of product vectors $|i_1 i_2 \cdots i_N\rangle$ ($i_k = 1, 2, \cdots$ for each $k\,(= 1, 2, \cdots, N)$). The partial trace (taken together with a factor $N$) referred to above is then defined by the formula

$$\langle i_1|\hat{\rho}_1|i_1'\rangle = N\sum_{i_2}\cdots\sum_{i_N}\langle i_1 i_2 \cdots i_N|\hat{\rho}_N|i_1' i_2 \cdots i_N\rangle. \qquad (7\text{-}64)$$


In the special case of an ideal gas in equilibrium described by the
canonical ensemble, the eigenstates of $\hat{\rho}_1$ turn out to be the
single-particle stationary states and the corresponding eigenvalues are the
mean numbers of particles in these states. In other words, the criterion for
BE condensation for the ideal gas can be stated by saying that the maximum
eigenvalue ($n_{\max}$) of $\hat{\rho}_1$ is to be of the order of (and less than) $N$.
This last statement actually constitutes the generalization of the criterion
for BE condensation for an interacting system: if $n_{\max}$ is the largest
eigenvalue of $\hat{\rho}_1$, then a necessary and sufficient condition for BE
condensation is that, in the thermodynamic limit, $n_{\max}/N$ (evaluated for an
equilibrium state) is a positive number less than unity (or, more generally,
has positive upper and lower bounds, both less than unity).
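As a purely numerical illustration of this criterion, the sketch below takes a small, hypothetical one-particle density matrix (a real symmetric 3 × 3 matrix with trace N = 100, not derived from any physical Hamiltonian), finds its largest eigenvalue by power iteration, and forms the ratio of that eigenvalue to N. The function names and the matrix entries are assumptions made for the example. The criterion proper concerns the behavior of this ratio in the thermodynamic limit, which a finite matrix cannot capture.

```python
import math


def largest_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a real symmetric, positive semi-definite
    matrix (list of rows), estimated by power iteration."""
    size = len(matrix)
    vector = [1.0 + 0.1 * k for k in range(size)]  # generic start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size))
                   for i in range(size)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
        # Rayleigh quotient v.M.v for the normalized vector
        eigenvalue = sum(vector[i] * matrix[i][j] * vector[j]
                         for i in range(size) for j in range(size))
    return eigenvalue


def condensate_fraction(rho1, particle_number):
    """Ratio n_max / N entering the Penrose-Onsager criterion."""
    return largest_eigenvalue(rho1) / particle_number


# Hypothetical density matrix with eigenvalues 70, 20, 10 (trace = N = 100).
RHO1 = [[60.0, 20.0, 0.0],
        [20.0, 30.0, 0.0],
        [0.0, 0.0, 10.0]]
FRACTION = condensate_fraction(RHO1, 100)
```

In this toy case the largest eigenvalue is a finite fraction of N, which is the situation the criterion associates with condensation; a matrix whose eigenvalues are all of order unity would give a ratio that vanishes as N grows.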


The validity of this statement can be verified directly for a homogeneous
system subject to periodic boundary conditions, in which case $\hat{\rho}_1$ is
seen to be diagonal in the single-particle momentum basis, and $n_{\max}$ turns
out to be the mean number of particles occupying the single-particle ground
state. This, for instance, is the case of the weakly interacting Bose gas
considered in sec. 7.3.3. More generally, it applies to non-homogeneous
and strongly interacting systems too.


The criterion mentioned above may be stated in terms of the one-body
off-diagonal density characterizing a given many-body state, defined in
terms of the field operator $\hat{\psi}(\mathbf{r})$ (denoted by
$\hat{\psi}_\sigma(\mathbf{r})$ in sec. 7.2; the spin index $\sigma$ is
redundant in the present context of spinless particles) as

$$n^{(1)}(\mathbf{r},\mathbf{r}') = \langle \hat{\psi}^\dagger(\mathbf{r})\, \hat{\psi}(\mathbf{r}') \rangle, \qquad \text{(7-65a)}$$

where the state under consideration may be a pure or a mixed one. The
diagonal density (more commonly referred to as the one-body density or,
simply, the density) for the state under consideration is related to
$n^{(1)}$ as


$$n(\mathbf{r}) = n^{(1)}(\mathbf{r},\mathbf{r}). \qquad \text{(7-65b)}$$

If the system under consideration is in a (symmetrized) pure
state with wave function $\Psi(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N)$,
then the average in (7-65a) is given by

$$n^{(1)}(\mathbf{r},\mathbf{r}') = N \int d^3r_2 \cdots d^3r_N\; \Psi^*(\mathbf{r}, \mathbf{r}_2, \cdots, \mathbf{r}_N)\, \Psi(\mathbf{r}', \mathbf{r}_2, \cdots, \mathbf{r}_N). \qquad \text{(7-66)}$$

If, on the other hand, the state in question is a mixture of a
number of orthonormal pure states with a given set of weights,
then the average is a weighted sum of the contributions coming
from the pure states, with the specified set of weights.


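The one-body off-diagonal density $n^{(1)}(\mathbf{r},\mathbf{r}')$ determines the
momentum distribution $n(\mathbf{p})$ ([113]) among the particles of the system
in the given many-body state as

$$n(\mathbf{p}) = \frac{1}{(2\pi\hbar)^3} \int d^3R\, d^3s\; n^{(1)}\!\left(\mathbf{R} + \tfrac{1}{2}\mathbf{s},\, \mathbf{R} - \tfrac{1}{2}\mathbf{s}\right) e^{-i\mathbf{p}\cdot\mathbf{s}/\hbar}. \qquad \text{(7-67)}$$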


In the case of a homogeneous isotropic system, $n^{(1)}(\mathbf{r},\mathbf{r}')$ takes
the form $n^{(1)}(s)$ ($s = |\mathbf{r} - \mathbf{r}'|$). In the absence of BE
condensation, $n^{(1)}(s) \to 0$ for $s \to \infty$, as a result of which
$n(\mathbf{p})$ is a smooth function at $\mathbf{p} = 0$. However, for a system
below the BE transition temperature, $n^{(1)}(s)$ attains a finite value at
large $s$:

$$\text{[in the case of BE condensation:]} \quad n^{(1)}(s) \to n_0 \ \text{ as } \ s \to \infty. \qquad \text{(7-68a)}$$


At the same time, by (7-67), the momentum distribution acquires a
delta-function peak at $\mathbf{p} = 0$:

$$n(\mathbf{p}) = N_0\, \delta(\mathbf{p}) + \tilde{n}(\mathbf{p}), \qquad \text{(7-68b)}$$

where $\tilde{n}(\mathbf{p})$ is the smooth part of the distribution and
$N_0 = n_0 V$ is such that, in the thermodynamic limit, $N_0/N$ ($N$ = total
number of particles) goes to a non-zero positive limit less than unity. This
limit then defines the condensate fraction for the system at the temperature
under consideration. The fact that BE condensation implies and is implied by
a non-zero limit of $n^{(1)}(s)$ at large $s$ is expressed by saying
that BE condensation involves ‘off-diagonal long range order’ (ODLRO).
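The link between the constant tail of the off-diagonal density and the delta peak can be made explicit in a short worked step. The sketch assumes a homogeneous system of volume $V$, the normalization of (7-67) with prefactor $1/(2\pi\hbar)^3$, and the identity $\int d^3s\, e^{-i\mathbf{p}\cdot\mathbf{s}/\hbar} = (2\pi\hbar)^3 \delta(\mathbf{p})$; the splitting of $n^{(1)}(s)$ into its limiting value and a decaying remainder is introduced here only for illustration.

```latex
\begin{aligned}
n(\mathbf{p})
 &= \frac{1}{(2\pi\hbar)^3} \int d^3R \int d^3s\; n^{(1)}(s)\,
    e^{-i\mathbf{p}\cdot\mathbf{s}/\hbar} \\
 &= \frac{V n_0}{(2\pi\hbar)^3} \int d^3s\; e^{-i\mathbf{p}\cdot\mathbf{s}/\hbar}
  + \frac{V}{(2\pi\hbar)^3} \int d^3s\, \bigl[ n^{(1)}(s) - n_0 \bigr]\,
    e^{-i\mathbf{p}\cdot\mathbf{s}/\hbar} \\
 &= n_0 V\, \delta(\mathbf{p}) + \tilde{n}(\mathbf{p}).
\end{aligned}
```

The first term reproduces the peak with weight $N_0 = n_0 V$, while the second term is the Fourier transform of a function that decays at large $s$ and is therefore expected to be smooth.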


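The off-diagonal density $n^{(1)}(\mathbf{r},\mathbf{r}')$ can also be obtained in
terms of the single-particle reduced density operator defined by (7-64):

$$n^{(1)}(\mathbf{r},\mathbf{r}') = \langle \mathbf{r}|\hat{\rho}_1|\mathbf{r}'\rangle. \qquad \text{(7-69)}$$

The eigenvalue equation of $\hat{\rho}_1$ can then be written in the form

$$\int d^3r'\; n^{(1)}(\mathbf{r},\mathbf{r}')\, \phi_i(\mathbf{r}') = n_i\, \phi_i(\mathbf{r}). \qquad \text{(7-70)}$$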


The eigenfunctions $\phi_i$ are said to constitute the natural single-particle
basis, and the corresponding eigenvalues $n_i$ determine the mean number
of particles in the natural basis states. The supremum ($n_{\max}$) of these
eigenvalues gives the maximum occupation number among the natural
single-particle states and features in the criterion for BE condensation
stated above. In the case of a homogeneous system the natural basis
states are the single-particle momentum states and, for such a system,
BE condensation can thus be described as condensation in
momentum space.


7.4.4.3 BE condensation: the order parameter 


Referring to the single-particle wave functions $\phi_i$ making up the natural
basis (refer to (7-70)), one can express the field operator $\hat{\psi}(\mathbf{r})$ as

$$\hat{\psi}(\mathbf{r}) = \sum_i \phi_i(\mathbf{r})\, \hat{a}_i, \qquad \text{(7-71)}$$

where $\hat{a}_i$ (resp. $\hat{a}_i^\dagger$) ($i = 1, 2, \cdots$) are annihilation
(resp. creation) operators for the states $\phi_i$. Denoting by $\phi_0$ the
eigenstate corresponding to the largest eigenvalue $n_{\max}$ ($= N_0$ for a
homogeneous system, where the single-particle momentum states constitute the
natural basis) we can separate the contribution of $\phi_0$ in (7-71) and write

$$\hat{\psi}(\mathbf{r}) = \phi_0(\mathbf{r})\, \hat{a}_0 + {\sum_i}' \phi_i(\mathbf{r})\, \hat{a}_i, \qquad \text{(7-72)}$$

where the prime over the summation symbol indicates that the contribution
of the state $\phi_0$ is to be omitted in the sum. At this point, following
the spirit of the Bogoliubov approximation (refer to sec. 7.3.3) we replace
$\hat{a}_0$ with the c-number $\sqrt{n_{\max}}$ and write

$$\hat{\psi}(\mathbf{r}) = \psi_0(\mathbf{r}) + {\sum_i}' \phi_i(\mathbf{r})\, \hat{a}_i, \qquad \text{(7-73)}$$


where $\psi_0(\mathbf{r})$ ($= \sqrt{n_{\max}}\, \phi_0(\mathbf{r})$) is referred to as
the ‘wave function of the condensate’ (recall that, for a homogeneous system,
$n_{\max} = N_0$, the number of bosons in the single-particle state with
momentum 0 which, in this case, defines the condensate). It is this function
that plays the role of the order parameter in BE condensation. It is zero in
the absence of BE condensation, but acquires a non-zero value as BE
condensation sets in. It is a complex quantity characterized by a modulus
and a phase:

$$\psi_0(\mathbf{r}) = |\psi_0(\mathbf{r})|\, e^{iS(\mathbf{r})}, \qquad \text{(7-74)}$$

where the phase $S(\mathbf{r})$ acquires a special relevance in superfluidity.
Following the idea of the Bogoliubov approximation one can write, as an
approximate equality,

$$\psi_0(\mathbf{r}) = \langle \hat{\psi}(\mathbf{r}) \rangle. \qquad \text{(7-75)}$$


1. BE condensation involves the breaking of a gauge symmetry,
corresponding to the choice of a constant phase term (say, $\alpha$) in
$S(\mathbf{r})$. The physical consequences of the theory are all independent of
this choice, and the condensate wave function describing any particular
instance of condensation in a system corresponds to some particular value of
$\alpha$.


2. In the case of a weakly interacting Bose gas, an approximate descrip- 
tion of the condensate wave function is provided by the Gross-Pitaevskii 
equation, which gives important insights into the relation between BE 


condensation and superfluidity.


7.4.4.4 BE condensation and superfluidity: a brief overview


BE condensation appears to be a necessary condition for superfluidity.
The relation between the two phenomena, though, is not a simple one:
referring to the particular case of superfluidity of He-4, the condensate
fraction at $T \approx 0$ has been theoretically and experimentally estimated
to be only ~ 10%. However, this is not the whole story since the condensate
is, in some sense, locked on to the superfluid component of He-II,
because the velocity field of the latter is determined by the phase of the
condensate wave function, introduced in (7-75), as

$$\mathbf{v}_s(\mathbf{r}, t) = \frac{\hbar}{m}\, \nabla S(\mathbf{r}, t). \qquad \text{(7-76)}$$


The relation between the phase of the condensate wave function and the 
superfluid velocity is obtained from the Galilean transformation property of 
the field operator $\hat{\psi}$, and the relation (7-75). Starting from the expression
of the condensate wave function in the rest frame of the superfluid and 
going over to a frame in which the superfluid is moving with a velocity v, 
one obtains a space- and time-dependent phase factor from which the for- 
mula (7-76) follows (refer to [113]). The formula holds for a velocity field 


that varies sufficiently slowly in space and time. 


This tells us that the velocity of the superfluid component is irrotational 
($\nabla \times \mathbf{v}_s = 0$) as in the case of an ideal fluid. However, vortices can still
be generated in the superfluid in the form of vortex tubes as indicated in 


sec. 7.4.5 below. 


In addition to the phase of the condensate determining the velocity field
of the superfluid component, the intimate link between BE condensation
and superfluidity is further underlined by the fact that, as in the case of
the weakly interacting Bose gas, the existence of the condensate provides
the necessary microscopic basis for the form of the excitation spectrum of
liquid He-4 postulated by Landau.


The investigations into the phenomena of BE condensation and
superfluidity have gained greatly in depth and scope with studies on cold and
trapped assemblies of gas atoms or molecules, since such studies have
furnished real-life examples of interacting systems undergoing BE
condensation and the superfluid transition. In contrast to liquid He-4, the
cooled gases constitute weakly interacting systems and provide rich
possibilities of studying the characteristics of BE condensation and
superfluidity in the gaseous phase, including the relation between the two.
Spin-polarized hydrogen offers very good possibilities for such studies, while
gaseous assemblies of alkali metal atoms have also been studied widely,
producing remarkable new observations. Finally, the superfluidity of He-3,
observed in the millikelvin range of temperatures, led to the study
of BE condensation and superfluidity in fermionic systems where pairs
of fermions combine, either in a molecular formation or in a loosely bound
structure (analogous to Cooper pairs considered in the BCS theory of
superconductivity, see sec. 7.6.4). In the latter case involving fermions
paired in loosely bound configurations, the study of the BEC-BCS cross-over
([113]) has led to new insights.


7.4.5 Quantum vortices


The idea of quantized vortices was put forward from theoretical consid- 
erations by Onsager and was subsequently developed by Feynman, and 
these vortices were observed in experimental studies by Hall and Vinen. 
Since the superfluid component of He-II has zero viscosity, corresponding 
to an irrotational velocity field (refer to (7-76)), vortices cannot be set up 
in a region of space filled up by it. However, if the region is not simply 
connected, i.e., includes a hollow not filled up by the superfluid com- 
ponent, then the line integral of velocity along a closed curve encircling 
the hollow (i.e., the circulation of the velocity along the closed curve) can 
be non-zero. Thus, a vortex can be set up in the superfluid component 
around a cylindrical hollow (a region devoid of the superfluid component,
but one that may be filled up by the normal component) that can be as 
thin as possible, the latter being referred to as a vortex line. Thus, it 
is possible for the superfluid component to circulate around the normal 


component. 


Considering a closed curve $\Gamma$ around a hollow as in fig. 7-7, and making
use of (7-76), the circulation (or vorticity) around $\Gamma$ is seen to be

$$\oint_{\Gamma} \mathbf{v}_s \cdot d\mathbf{l} = \frac{\hbar}{m} \oint_{\Gamma} dS. \qquad \text{(7-77)}$$


Here, the closed integral on the right hand side need not be zero since $S$
is a phase and the contour $\Gamma$ cannot be shrunk continuously to a point.
Instead, its value can be any integral multiple of $2\pi$, which tells us that
the vorticity is quantized:

$$\oint_{\Gamma} \mathbf{v}_s \cdot d\mathbf{l} = n\, \frac{2\pi\hbar}{m} = n\, \frac{h}{m} \qquad (n = 0, \pm 1, \pm 2, \cdots). \qquad \text{(7-78)}$$


Figure 7-7: Illustrating the appearance of a non-zero circulation in a multiply
connected region where a quantized vorticity of the superfluid component of liquid He-II
can appear around a hollow, the latter being a region not occupied by the superfluid
component; $\Gamma$ is a closed curve encircling the hollow.
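To get a feel for the magnitudes involved, the following sketch evaluates the quantum of circulation for He-4, assuming the standard form of the quantization condition (circulation equal to an integer times h/m, with m the mass of a He-4 atom). It also estimates the flow speed at a given distance from a straight vortex line under the additional assumption of an axially symmetric flow. The function names are illustrative, and the constants are the usual SI values.

```python
import math

PLANCK_H = 6.62607015e-34          # Planck constant, J s
ATOMIC_MASS_UNIT = 1.66053907e-27  # atomic mass constant, kg
HELIUM4_MASS = 4.002602 * ATOMIC_MASS_UNIT  # mass of a He-4 atom, kg


def circulation(winding_number, mass=HELIUM4_MASS):
    """Circulation in m^2/s for an integer winding number n: n * h / m."""
    return winding_number * PLANCK_H / mass


def azimuthal_speed(winding_number, radius, mass=HELIUM4_MASS):
    """Speed in m/s at distance `radius` (m) from a straight vortex line,
    assuming axial symmetry so that circulation = 2 * pi * r * v."""
    return circulation(winding_number, mass) / (2.0 * math.pi * radius)


KAPPA = circulation(1)                       # roughly 1e-7 m^2/s
SPEED_AT_1_MICRON = azimuthal_speed(1, 1.0e-6)
```

The 1/r growth of the speed towards the axis indicates why the superfluid must be absent from a thin core around the vortex line.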


Quantized vortices can be produced in non-homogeneous liquid He-II,
where they can lead to macroscopic effects. Indeed, they are so
ubiquitous below the $\lambda$-transition that they appear to be generated
spontaneously as excitations in He-II. It is because of these excitations that the
critical velocity below which superfluidity is observed in liquid He-4 turns
out to be much lower than the value predicted by the excitation spectrum
of phonons and rotons (refer to sec. 7.4.2).


7.5 Interacting fermion systems 


7.5.1 Interacting fermions: a brief overview 


Ideal gas systems made up of non-interacting fermions have been
considered in section 3.3.6. We will now look at interacting fermion systems
at low temperatures where quantum effects predominate. The picture
here is more varied than that for interacting bosons since, in the pres- 
ence of attractive interactions (even extremely weak ones), fermions can 
pair together to form molecules or dimers that behave like bosons un- 
der particle interchange, and a completely new type of behavior emerges. 
This, for instance, is what happens in the case of liquid He-3, where 
molecular dimers obeying Bose statistics are formed out of He-3 atoms 
in virtue of a weak and attractive 2-body interaction, and the assembly 
of dimers then undergoes a Bose condensation in the millikelvin tem- 
perature range. Under a different set of conditions, He-3 atoms pair up 
to form configurations analogous to Cooper pairs in the BCS theory (see 
sec. 7.6.4) where, however, the attraction leading to the pair formation 
is in the nature of an effective many-body force. Assemblies of fermions 
under low temperature conditions in which such bound configurations 
do not arise correspond, in a certain sense to be indicated below, to the 
ideal Fermi gas. These are referred to as ‘normal’ Fermi systems. Here one
may have either gaseous Fermi systems with weak interactions between 
the fermions, or a condensed system with strong interactions. Liquid He- 
3 above the superfluid transition temperature is a system of the latter 
kind, the only known one in nature. A highly insightful and fertile theory 
describing the behavior of He-3 was put forward by Landau. Referred 
to as the theory of Fermi liquids, this theory, though phenomenological 
in nature, subsequently developed into a branch of many-body physics 
of great relevance. The basic ideas underlying this theory will be our 


principal concern here and will be outlined in sec. 7.5.2 below.


Theoretical and experimental work on trapped ultracold Fermi gases has
now produced a huge literature and is a highly active branch of low
temperature physics, of which an excellent account is to be found in [113].


A cooled and trapped assembly of fermions under a considerably wide 
range of experimental conditions constitutes a dilute interacting Fermi 
gas, for which the interaction can be effectively described in terms of a 
single parameter, namely, the s-wave scattering length a. However, unlike 
the dilute Bose gas where one requires a to be positive to ensure stability, 
one can consider both negative and positive values of a in the case of a 
Fermi gas. Experimental techniques are available where one can tune 
this parameter from negative to positive values, where there takes place 
a cross-over from attractive to repulsive interactions. However, there re- 
mains the possibility of dimer formation, as mentioned above, even for 
positive values of a. In the following, we will not consider the possibility 
of the formation of composite bosons (excepting in the case of the electron 
gas considered in sec. 7.6). Referring in passing to the case of a weakly 
repulsive dilute Fermi gas, there exists a perturbative approach for suf- 
ficiently small values of a to develop the statistical mechanics of such a 
system where one obtains small corrections over results pertaining to the 
ideal gas of fermions. For instance, the perturbed ground state energy of
a gas made of $N$ particles turns out to be given by the expression

$$E_0 = \frac{3}{5} N \epsilon_F \left[ 1 + \frac{10}{9\pi} k_F a + O\big((k_F a)^2\big) \right], \qquad \text{(7-79a)}$$


where $\epsilon_F$ is the Fermi energy of the unperturbed system given in (3-75),
in terms of which the unperturbed ground state energy is $\frac{3}{5} N \epsilon_F$
(refer to eq. (3-76a); note the slight difference in notation; the particle
number $N$ and the mean number of particles $\bar{N}$ defined for the canonical
and the grand canonical ensembles respectively, agree in the thermodynamic
limit). The Fermi momentum (in units of $\hbar$) is

$$k_F = \frac{\sqrt{2 m \epsilon_F}}{\hbar} = (3\pi^2 n)^{1/3} \qquad \text{(7-79b)}$$

(check this out).


One observes that, since $k_F a \propto a n^{1/3}$, the small parameter in the
perturbative expansion is $a n^{1/3}$, as in the case of the weakly interacting
Bose gas (sec. 7.3.2).
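The size of the first-order correction can be checked numerically. The sketch below assumes spin-1/2 fermions, the relation $k_F = (3\pi^2 n)^{1/3}$, and the leading correction factor $1 + \frac{10}{9\pi} k_F a$ for the ground state energy, with terms of order $(k_F a)^2$ dropped. The function names and the sample density, mass, and scattering length are hypothetical values chosen only to exercise the formulas.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s


def fermi_wavenumber(density):
    """k_F = (3 pi^2 n)^(1/3) for spin-1/2 fermions; density in m^-3."""
    return (3.0 * math.pi ** 2 * density) ** (1.0 / 3.0)


def fermi_energy(density, mass):
    """eps_F = (hbar k_F)^2 / (2 m), in joules."""
    return (HBAR * fermi_wavenumber(density)) ** 2 / (2.0 * mass)


def energy_per_particle(density, mass, scattering_length):
    """(3/5) eps_F [1 + (10 / (9 pi)) k_F a], dropping O((k_F a)^2)."""
    k_f_a = fermi_wavenumber(density) * scattering_length
    correction = 1.0 + 10.0 * k_f_a / (9.0 * math.pi)
    return 0.6 * fermi_energy(density, mass) * correction


# Hypothetical dilute gas: density 1e19 m^-3, atomic mass 1e-26 kg,
# scattering length tuned so that k_F * a = 0.1.
DENSITY = 1.0e19
MASS = 1.0e-26
LENGTH = 0.1 / fermi_wavenumber(DENSITY)
RATIO = energy_per_particle(DENSITY, MASS, LENGTH) / fermi_energy(DENSITY, MASS)
```

For $k_F a = 0.1$ the interaction raises the energy per particle by about 3.5 percent over the ideal-gas value of $\frac{3}{5}\epsilon_F$, which gives a sense of how small the expansion parameter has to be for the first-order term to suffice.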


In the following section (sec. 7.5.2) we outline, following [112], [5], the 
basic idea underlying Landau’s theory of ‘normal’ Fermi liquids where 
a normal Fermi liquid is an assembly of interacting fermions that do 
not form bound configurations (see below for a more precise definition), 
there being no restriction on the strength of the interaction, as a result 
of which the fermions may be in a condensed, liquid phase. As already
mentioned, liquid He-3 above its superfluid transition is a paradigmatic 
example. Landau’s theory constitutes a landmark in low temperature 


many-body physics. 


We will confine our attention to neutral Fermi liquids. The pool of
conduction electrons in a metal constitutes an instance of a charged Fermi
liquid.


7.5.2 Normal Fermi liquids


7.5.2.1 Landau’s theory of Fermi liquids: basic ideas


The theory of normal Fermi liquids is developed in close correspondence
with that of the ideal Fermi gas (section 3.3.6.2). Energy eigenstates of
the latter are described in terms of a distribution $\{n_{\mathbf{p}}\}$ in all the
single-particle momentum (and spin) eigenstates (we adopt the convention that
when $\mathbf{p}$ is used as a sub-index in identifying a state, it includes the spin
index in addition to the momentum). The ground state of the ideal system
is described by saying that all states with momentum $|\mathbf{p}| < p_F$ (the Fermi
momentum, corresponding to the Fermi energy $\epsilon_F$) are occupied while
those with $|\mathbf{p}| > p_F$ are unoccupied. The spherical surface in
momentum space defined by $|\mathbf{p}| = p_F$ constitutes the Fermi surface. We denote
the occupation numbers for the ground state by $n^{(0)}_{\mathbf{p}}$, where $\mathbf{p}$ ranges
over the single-particle states. The distribution $\{n^{(0)}_{\mathbf{p}}\}$ defines the
equilibrium state of the system at $T = 0$.


Excitations in the ideal system are described by specifying the collection
$\{\delta n_{\mathbf{p}}\} = \{n_{\mathbf{p}} - n^{(0)}_{\mathbf{p}}\}$. The elementary excitations in
this case are the particles and holes residing (respectively) outside and
inside the Fermi surface in momentum space.


We now consider an interacting Fermi liquid (the ‘real’ system, in con- 
trast to the ‘ideal’ one referred to above). We imagine that the real system 
is generated from the ideal system by adiabatically switching on the in- 
teraction. This being an infinitely slow process, one can assume that 


eigenstates of the ideal system are gradually transformed into those of
the real system, i.e., there exists a one-to-one correspondence between
the two sets of eigenstates. A system for which this assumption is valid
is, by definition, a normal Fermi liquid. In particular, the
ground state of the real system is generated from the distribution
$\{n^{(0)}_{\mathbf{p}}\}$ of the ideal system.


Conclusions derived from this assumption of one-to-one correspondence 
are found to be valid for liquid He-3 above the millikelvin temperature 
range, which therefore constitutes an instance of a normal Fermi liquid. 
Cooled and trapped assemblies of fermion atoms under appropriate cir- 


cumstances constitute other instances of normal Fermi liquids. 


The assumption that the real ground state is generated from the ideal 
ground state in the process of adiabatic switching on, is valid in the case 
of an isotropic system such as a liquid for which the Fermi surface is 
spherical in shape. For a non-isotropic system the real ground state may 
be generated from an excited state of the ideal system. We will assume 


that the former situation applies and the Fermi surface is spherical. 


Imagine an ideal Fermi gas in which a single particle with momentum 
p (and spin o; the spin will be left implied in the following paragraphs, 
and will be considered explicitly where the context makes it necessary) is 
added over the ground state, following which an interaction is switched 
on adiabatically. Assuming that the interaction conserves momentum, 
we will end up with an eigenstate of the real fluid corresponding to the 
‘ground state plus one p-particle’ eigenstate of the ideal gas. This
constitutes a state with one elementary excitation of momentum p over the
ground state of the real system, referred to as a quasi-particle. In other
words, a particle with momentum p of the ideal system (see above)
corresponds to a quasi-particle with momentum p of the real system.
Quasi-holes of the real system are similarly defined. An eigenstate of the real
system is then made up from a number of quasi-particles and quasi-holes
with a certain momentum distribution $\{n_{\mathbf{p}}\}$ in the various momentum
states (I repeat that spin states are left implied). In the following, the term
‘quasi-particle’ will be used inclusively to refer to quasi-holes as well.


We denote the ground state energies of the ideal and the real systems by
$E_0^{(0)}$ and $E^{(0)}$ respectively. For the ideal system, the energy of the
ground state plus a single p-particle (resp. p-hole) is
$E_0^{(0)} + \frac{p^2}{2m}$ (resp. $E_0^{(0)} - \frac{p^2}{2m}$).
Denoting the energy of the corresponding eigenstate of the real system
by $E^{(0)} + \epsilon_{\mathbf{p}}$, we identify $\epsilon_{\mathbf{p}}$ as the energy of an
elementary excitation (a ‘quasi-particle’ in the inclusive sense mentioned
above) of momentum p, considered independently of other excitations in the
system.


More generally, an eigenstate of the real system can be described in terms 
of a distribution {n,} of quasi-particles of various momenta p. Landau’s 
theory applies to sufficiently low temperatures such that the relevant 
momenta are close to the Fermi momentum $p_F$ given by (refer to
formula (3-75); we take $g = 2$, corresponding to spin-$\frac{1}{2}$ fermions)

$$p_F = \hbar \left( 3\pi^2\, \frac{N}{V} \right)^{1/3}. \qquad \text{(7-80)}$$

The energy of a quasi-particle on the Fermi surface will be denoted by $\epsilon_F$,


and represents the chemical potential ($\mu$) of the system at $T = 0$.

In accordance with what has been stated above, the thermodynamic
properties of the real system depend, not on the distributions $\{n_{\mathbf{p}}\}$
(recall that a momentum distribution of the quasi-particles defines a
microstate of the system; the same notation will be used to denote an
equilibrium state as well, where the intended meaning is to be read from the
context) themselves but on the deviations $\{\delta n_{\mathbf{p}}\}$ from the
distribution $\{n^{(0)}_{\mathbf{p}}\}$ specifying the ground state, where these
deviations are non-zero only in a thin shell of momentum space around the
Fermi surface $S_F$. The sum $\sum_{\mathbf{p}} \delta n_{\mathbf{p}}$ gives the total number
of elementary excitations. This number can fluctuate, depending on the
chemical potential $\mu$ of the collection of the elementary excitations. The
Fermi liquid is then described appropriately in terms of the grand canonical
ensemble. At $T \approx 0$, the grand potential $F$ (referred to as the ‘free
energy’ in [112]; we will follow this terminology), relative to its value
($F_0$) in the ground state is given, naively speaking, by

$$F - F_0 \approx \sum_{\mathbf{p}} (\epsilon_{\mathbf{p}} - \mu)\, \delta n_{\mathbf{p}}, \qquad \text{(7-81a)}$$


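(however, see below) where the energy $E$ of the elementary excitations,
considered independently of one another, and the mean number $N$ of the
excitations are given by

$$E = \sum_{\mathbf{p}} \epsilon_{\mathbf{p}}\, \delta n_{\mathbf{p}}, \qquad N = \sum_{\mathbf{p}} \delta n_{\mathbf{p}}. \qquad \text{(7-81b)}$$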


The distribution $n_{\mathbf{p}}$ (as also $\delta n_{\mathbf{p}}$) is typically a highly
discontinuous function of $\mathbf{p}$. It is therefore necessary to consider a
smoothed distribution where $n_{\mathbf{p}}$ and $\delta n_{\mathbf{p}}$ are defined by
averaging over neighbouring momentum states in an appropriate manner.


It may be mentioned that $E$ (and $F$, $F_0$ of formula (7-82a) below) refers to
the entire system of volume $V$. Since, in the thermodynamic limit $V \to \infty$,
a summation over $\mathbf{p}$ carries a factor $V$, $\epsilon_{\mathbf{p}}$ is of the order
$V^0$, i.e., $O(1)$ (as it should be).


However, the approximation (7-81a) is not quite consistent since a feature 
of central relevance in Landau’s theory is that the elementary excitations 
are not independent but necessarily interact with one another (depending 
on the actual interaction between the particles constituting the Fermi 
liquid) and that this interaction has to be taken into account in order to 


derive a consistent approximation to $F - F_0$, which now reads

$$F - F_0 = \sum_{\mathbf{p}\sigma} (\epsilon_{\mathbf{p}\sigma} - \mu)\, \delta n_{\mathbf{p}\sigma} + \frac{1}{2} \sum_{\mathbf{p}\sigma,\, \mathbf{p}'\sigma'} f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}\, \delta n_{\mathbf{p}\sigma}\, \delta n_{\mathbf{p}'\sigma'}, \qquad \text{(7-82a)}$$

where we have explicitly mentioned the spin-indices in the various terms,
and where the second term on the right hand side gives the contribution
of the interaction among the quasi-particles to the free energy relative to
the ground state at $T = 0$. In this expression, $f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}$
denotes the energy of a chosen quasi-particle in state $\mathbf{p}\sigma$, in the wake
of a quasi-particle in state $\mathbf{p}'\sigma'$. The factor of half comes in when the
mutual energy of a pair of quasi-particles is under consideration. The
expression

$$\tilde{\epsilon}_{\mathbf{p}\sigma} = \epsilon_{\mathbf{p}\sigma} + \sum_{\mathbf{p}'\sigma'} f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}\, \delta n_{\mathbf{p}'\sigma'} \qquad \text{(7-82b)}$$

represents the energy of a quasi-particle in state $\mathbf{p}\sigma$ in the presence of
all other quasi-particles in a given distribution. It is referred to as the
local energy of the quasi-particle. Evidently, $f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}$ as
defined above is $O(V^{-1})$ in the thermodynamic limit.
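The relation between the free energy functional and the local energy can be checked on a toy model. The sketch below assumes a finite set of three quasi-particle states with a symmetric interaction matrix, evaluates the free energy relative to the ground state as the sum of a linear term and one half of the pair-interaction term, and verifies numerically that its derivative with respect to one occupation deviation equals the local energy minus the chemical potential. All names and numerical values are hypothetical and in arbitrary energy units.

```python
def free_energy_shift(eps, f, dn, mu):
    """F - F0: linear term plus one half of the pair-interaction term."""
    count = len(dn)
    linear = sum((eps[i] - mu) * dn[i] for i in range(count))
    pair = 0.5 * sum(f[i][j] * dn[i] * dn[j]
                     for i in range(count) for j in range(count))
    return linear + pair


def local_energy(eps, f, dn, i):
    """Energy of a quasi-particle in state i in the presence of the others."""
    return eps[i] + sum(f[i][j] * dn[j] for j in range(len(dn)))


def numerical_slope(eps, f, dn, mu, i, step=1e-6):
    """Central-difference derivative of F - F0 with respect to dn[i]."""
    up = list(dn)
    up[i] += step
    down = list(dn)
    down[i] -= step
    return (free_energy_shift(eps, f, up, mu)
            - free_energy_shift(eps, f, down, mu)) / (2.0 * step)


# Hypothetical three-state example.
EPS = [1.0, 1.2, 0.9]
F_MATRIX = [[0.0, 0.3, 0.1],
            [0.3, 0.0, 0.2],
            [0.1, 0.2, 0.0]]
DN = [0.2, -0.1, 0.05]
MU = 1.0
```

The factor of one half in the pair term is what makes the derivative come out as the full sum over the other quasi-particles, since each pair appears twice in the double sum.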


1. In the expression (7-82b) for the local energy, $\delta n_{\mathbf{p}}$ (in the
abbreviated notation where the sub-index $\mathbf{p}$ is understood to include the
spin variable $\sigma$) stands for the excess over the distribution
$n^{(0)}_{\mathbf{p}}$ for the ground state. The latter being the equilibrium state at
$T = 0$, one can write, for an equilibrium state at temperature $T$,

$$\delta n_{\mathbf{p}} = n^{(0)}_{\mathbf{p}}(T, \mu) - n^{(0)}_{\mathbf{p}}(0, \mu), \qquad \text{(7-82c)}$$

where now the super-index (0) is meant to denote an equilibrium state
(which is synonymous with the ground state at $T = 0$). The chemical
potential for a given value of the mean number of particles of the
system varies with temperature through a correction term $O(T^2)$ and
can be assumed to remain unchanged for sufficiently small values of
$T$.


2. Though the first term on the right hand side of (7-82a) appears to
represent the leading order in an approximation scheme in which the
second term is of a higher order of smallness, both the terms are
actually of the same order in the expression for the deviation of the free
energy from its ground state value. At sufficiently low temperatures,
one can assume that $\delta n_{\mathbf{p}}$ is non-zero in a small region of width, say
$\delta$, around the Fermi sphere, in which case both the terms are found to be
of the order of $\delta^2$ [112]. Thus, it would be inconsistent to approximate
$F - F_0$ by means of the first term alone, and one necessarily has to
take account of the interaction among the quasi-particles even in the
leading order of approximation in the theory. More precisely, the two
terms on the right hand side of (7-82a) together constitute the leading
approximation in an expansion in terms of the small parameter ([112])

$$\alpha = \frac{\delta}{p_F}. \qquad \text{(7-82d)}$$


3. Starting from a given distribution for the ideal system, as one arrives 
at the corresponding distribution of the real system by the process of 
adiabatic switching on of the interaction, one can imagine that a par- 
ticle of the ideal system gradually influences more and more particles 
around it, and finally ends up creating a region of influence around 
itself, involving other particles in this region. This is how a quasi- 
particle is generated and the process can be picturesquely described 
as the ‘dressing’ of a bare particle, where the region of influence defin- 


ing a quasi-particle can be looked upon as a wake carried by the latter. 


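It is apparent that $f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}$ can be defined as the second
variational derivative of $F - F_0$ with respect to $\{\delta n_{\mathbf{p}\sigma}\}$,
which implies the symmetry relation

$$f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'} = f_{\mathbf{p}'\sigma',\mathbf{p}\sigma}. \qquad \text{(7-82e)}$$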


The energy $E$ and the free energy $F$ are functionals of the smoothed
distribution $n_{\mathbf{p}}$ (or, equivalently, of $\delta n_{\mathbf{p}}$). Generally
speaking, an arbitrarily specified distribution $n_{\mathbf{p}}$ describes a
non-equilibrium state of the system. On the other hand, a given temperature
and chemical potential define an equilibrium state, which has been denoted by
the super-index (0) in (7-82c). At times, the super-index (0) is omitted while
referring to an equilibrium state, and one has to refer to the context to get
the intended meaning.


Further symmetry relations follow if there is no applied magnetic field
(i.e., time reversal invariance holds) and, moreover, the Fermi surface
is invariant under reflection in the momentum space. The coefficients
$f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'}$ then depend on the spin variables only through their
relative orientation [112], i.e., out of the four possible combinations for
$\sigma\sigma'$ only two are relevant, namely, the parallel and anti-parallel
ones (i.e., the two possible parallel combinations lead to identical values
of $f$ for given values of $\mathbf{p}$, $\mathbf{p}'$, and so do the two possible
anti-parallel combinations). It is convenient to express these two values as

$$[\sigma, \sigma' \text{ parallel:}] \quad f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'} = f^{s}_{\mathbf{p}\mathbf{p}'} + f^{a}_{\mathbf{p}\mathbf{p}'},$$

$$[\sigma, \sigma' \text{ anti-parallel:}] \quad f_{\mathbf{p}\sigma,\mathbf{p}'\sigma'} = f^{s}_{\mathbf{p}\mathbf{p}'} - f^{a}_{\mathbf{p}\mathbf{p}'}. \qquad \text{(7-83)}$$

In these expressions, $f^{s}_{\mathbf{p}\mathbf{p}'}$ and $f^{a}_{\mathbf{p}\mathbf{p}'}$ are referred to
respectively as the spin symmetric and spin anti-symmetric parts of the
quasi-particle interaction.


If, moreover, the system under consideration is isotropic (such as a liquid
in the absence of an external field) one can further simplify by noting
that, for $\mathbf{p}$, $\mathbf{p}'$ close to the Fermi surface, the interaction depends
only on the angle $\theta$ between $\mathbf{p}$ and $\mathbf{p}'$, and hence can be
expanded in a series of Legendre polynomials:

$$f^{s(a)}_{\mathbf{p}\mathbf{p}'} = \sum_{l=0}^{\infty} f^{s(a)}_l\, P_l(\cos\theta). \qquad \text{(7-84)}$$

Finally, it turns out to be convenient to normalize in terms of the density
of states $\nu_F$ at the Fermi surface (where $\nu_F$ is related to the effective
mass $m^*$ of the quasi-particles, see below) and introduce the reduced
strengths of interaction $F^{s(a)}_l$ as

$$F^{s(a)}_l = \nu_F\, f^{s(a)}_l = \frac{m^* p_F V}{\pi^2 \hbar^3}\, f^{s(a)}_l. \qquad \text{(7-85)}$$

The parameters $F^{s(a)}_l$ completely determine the interaction between
quasi-particles located close to the Fermi surface. All the equilibrium and
non-equilibrium properties (the latter in the linear response regime; refer to
chapter 8) of a normal Fermi liquid can be expressed in terms of these
parameters. For most practical purposes, one needs only the parameter
$m^*$ and the low-$l$ coefficients $F^{s(a)}_0$, $F^{s(a)}_1$ to describe the
behavior of the system. These appear as phenomenological parameters in
Landau’s theory of Fermi liquids. In providing us with a complete description
in terms of these few parameters, Landau’s theory succeeds in achieving a great
simplification in the understanding of interacting Fermi systems.
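A small numerical sketch may help fix how a handful of coefficients determines the angular dependence of the interaction. It assumes a truncated Legendre series with a few hypothetical coefficients (arbitrary units, not fitted to any real liquid), evaluates the Legendre polynomials by Bonnet's recurrence, and forms the reduced parameters by multiplying with a density of states that is simply passed in as a number. The function names are illustrative.

```python
import math


def legendre(l_max, x):
    """Values P_0(x) .. P_{l_max}(x) from Bonnet's recurrence
    (l + 1) P_{l+1} = (2 l + 1) x P_l - l P_{l-1}."""
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]


def interaction(coefficients, theta):
    """Truncated series sum_l f_l P_l(cos theta)."""
    polys = legendre(len(coefficients) - 1, math.cos(theta))
    return sum(f_l * p_l for f_l, p_l in zip(coefficients, polys))


def reduced_parameters(coefficients, density_of_states):
    """Dimensionless F_l = nu_F * f_l."""
    return [density_of_states * f_l for f_l in coefficients]


# Hypothetical coefficients f_0, f_1, f_2.
COEFFICIENTS = [1.0, 0.5, 0.2]
FORWARD = interaction(COEFFICIENTS, 0.0)             # all P_l(1) = 1
TRANSVERSE = interaction(COEFFICIENTS, math.pi / 2)  # P_1 = 0, P_2 = -1/2
```

Since $f_l$ is $O(V^{-1})$ and $\nu_F$ is proportional to $V$, the reduced parameters stay finite in the thermodynamic limit, which is the reason for the normalization.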


The validity of Landau’s theory is conditional upon the quasi-particles being 
stable entities. In reality, the quasi-particles have a finite lifetime because 
of decay processes. For instance, a quasi-particle lying outside the Fermi 
sphere may decay into two quasi-particles and one quasi-hole. However, for 
a quasi-particle of energy ($\epsilon$) sufficiently close to the Fermi energy ($\epsilon_F$), its
lifetime turns out to be $\sim (\epsilon - \epsilon_F)^{-2}$ and thus, at sufficiently low temperatures,
at which only those quasi-particles are excited that have energies close to 


the Fermi energy, Landau’s theory rests on a sound footing.


One of the basic parameters of relevance in Landau’s theory is the
effective mass ($m^*$) of the quasi-particles, where the effective mass differs
from the bare mass of the fermions due to the process of ‘dressing’
resulting from the interaction. Recalling that, at sufficiently low
temperatures, one need consider only momenta $\mathbf{p}$ of quasi-particles close
to the Fermi surface $S_F$, the relevant quasi-particle energies can be
approximated as

$$\epsilon_{\mathbf{p}} \approx \epsilon_F + \nabla_{\mathbf{p}} \epsilon_{\mathbf{p}} \big|_{S_F} \cdot (\mathbf{p} - \mathbf{p}_F), \qquad \text{(7-86a)}$$

where the gradient is evaluated on the Fermi surface and the sub-index
$F$ is used to denote values evaluated on $S_F$. Defining

$$\mathbf{v}_F = \nabla_{\mathbf{p}} \epsilon_{\mathbf{p}} \big|_{S_F}, \qquad \text{(7-86b)}$$

we can write (7-86a) as

$$\epsilon_{\mathbf{p}} \approx \epsilon_F + \mathbf{v}_F \cdot (\mathbf{p} - \mathbf{p}_F). \qquad \text{(7-86c)}$$

In the case of a spherical Fermi surface, $\mathbf{v}_F$ is parallel to $\mathbf{p}_F$, and the
effective mass $m^*$ is defined by writing

$$\mathbf{v}_F = \frac{\mathbf{p}_F}{m^*}. \qquad \text{(7-86d)}$$


In the analogous case of the ideal Fermi gas, one has $\epsilon_p = \frac{p^2}{2m}$ and, close to
the Fermi surface,

$$ \epsilon_p \approx \epsilon_F + \frac{\mathbf{p}_F}{m} \cdot (\mathbf{p} - \mathbf{p}_F). \tag{7-87} $$

This once again underlines the close correspondence between the ideal
Fermi gas and the Fermi liquid at sufficiently low temperatures, where one
simply has to replace the particle mass $m$ with $m^*$.

Noting that a quasi-particle can be looked upon as a bare particle with an
appropriate ‘dressing’, it is apparent why much of the physics of a Fermi
liquid at low temperatures can be captured by the replacement $m \to m^*$,
even when the Fermi particles interact strongly with one another. One
can interpret $\frac{\mathbf{p}_F}{m^*}$ as a group velocity of a quasi-particle, this being the
velocity of the wake associated with the latter.


In the more general situation where the Fermi surface is not necessarily a
spherical one, the relevant parameter to use instead of $m^*$ is the density of
states at the Fermi surface. One defines the density of states at energy $\epsilon$ as

$$ \nu(\epsilon) = \sum_p \delta(\epsilon - \epsilon_p). \tag{7-88} $$

At sufficiently low temperatures, physical properties of the Fermi liquid
depend on the density of states $\nu_F$ on the Fermi surface which depends, in
general, on the direction of $\mathbf{p}_F$. In the case of the ideal Fermi gas one has the
relation $\nu_F = \frac{m p_F}{\pi^2\hbar^3}V$ (check this formula out). For an isotropic Fermi liquid,
the corresponding formula is obtained by the replacement $m \to m^*$ (in this
context, refer back to (7-85), where the said formula has been made use of).
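
The ideal-gas density of states quoted above can be checked numerically. The following is a minimal sketch, assuming the free-particle dispersion $\epsilon = p^2/2m$ and a state count that includes both spin orientations, $N(p) = V p^3/(3\pi^2\hbar^3)$; the function names are illustrative and are not taken from the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def states_below(p, volume):
    """Number of single-particle states (both spins) with momentum below p."""
    return volume * p**3 / (3.0 * math.pi**2 * HBAR**3)


def dos_formula(mass, p_fermi, volume):
    """Closed form nu_F = m p_F V / (pi^2 hbar^3), in states per unit energy."""
    return mass * p_fermi * volume / (math.pi**2 * HBAR**3)


def dos_numeric(mass, p_fermi, volume, rel_step=1e-6):
    """Central-difference estimate of dN/d(eps) at the Fermi energy,
    assuming eps = p^2 / (2 m)."""
    eps_f = p_fermi**2 / (2.0 * mass)
    d_eps = rel_step * eps_f
    p_hi = math.sqrt(2.0 * mass * (eps_f + d_eps))
    p_lo = math.sqrt(2.0 * mass * (eps_f - d_eps))
    return (states_below(p_hi, volume) - states_below(p_lo, volume)) / (2.0 * d_eps)
```

Comparing `dos_numeric` with `dos_formula` for any positive mass, Fermi momentum, and volume gives agreement up to the finite-difference error. Replacing the mass argument by $m^*$ gives the corresponding check for an isotropic Fermi liquid.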


Before closing this section it may be mentioned that, in the formula (7-86c)
(with $\mathbf{v}_F$ given by (7-86d)), $\epsilon_p$ is the energy of a quasi-particle considered
independently of the other quasi-particles in a distribution, while $m^*$
represents the effective mass of the quasi-particle in interaction with other
quasi-particles around it. This is an inconsistency that shows up in a
violation of Galilean invariance of the theory. Galilean invariance is
restored when one adds to $\epsilon_p$ the interaction energy of the quasi-particle in
question with other quasi-particles in the system. More specifically, for
a homogeneous system in equilibrium, Galilean invariance requires that
the following relation be satisfied (refer to [5]):

$$ \frac{m^*}{m} = 1 + \frac{F_1^s}{3}. \tag{7-89a} $$

As seen from this formula, the positivity of the effective mass $m^*$ requires
that

$$ F_1^s > -3. \tag{7-89b} $$

This inequality is also obtained as part of a set of conditions pertaining
to the thermodynamic stability of the Fermi liquid, as we mention below
(refer to (7-108)).


The derivation of the equilibrium and non-equilibrium properties from 
Landau’s theory (refer to [112], [5]), though relatively straightforward, is 
outside the scope of this book, where I focus only on the basic concepts 
defining the theory. However, I state below (sec. 7.5.2.2) a number of use- 
ful results for a few static properties of a Fermi liquid so as to illustrate 
the usefulness of Landau’s theory. This will be followed, in sec. 7.5.2.3 
by a few sketchy remarks by way of introduction to non-equilibrium phe- 
nomena and the transport equation in Fermi liquids, which is a vast 


subject in itself and requires considerations quite outside the scope of
the present book. Basic ideas in the Boltzmann equation and in linear
response theory, both pertaining to non-equilibrium phenomena in general,
will be outlined in chapter 8. Though closely related to these two
approaches, the theory of non-equilibrium phenomena in Fermi liquids
involves a number of special considerations, and can be found in [112],
[5], [139].


7.5.2.2 Fermi liquids: equilibrium properties 


The basic formula to start from is the distribution ($n^{(0)}_p(T, \mu)$) of the
quasi-particles in an equilibrium state defined by specified values of $T$, $\mu$, which
reads

$$ n^{(0)}_p(T, \mu) = \frac{1}{\exp\left(\frac{\tilde{\epsilon}_p - \mu}{k_B T}\right) + 1}, \tag{7-90} $$

which resembles the Fermi-Dirac distribution formula for an ideal gas
where, however, $\tilde{\epsilon}_p$ represents the energy of the dressed (i.e., interacting)
quasi-particle in state $p$. At sufficiently low temperatures this can be
replaced with the bare quasi-particle energy $\epsilon_p$. Formula (7-90) underlines
the correspondence between the states of an ideal Fermi gas and those of
a normal Fermi liquid obtained by the process of adiabatic switching on
of the interaction between the fermions.
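
As an illustration of formula (7-90), the following minimal sketch evaluates the occupation number $1/(\exp[(\epsilon - \mu)/k_B T] + 1)$ in a numerically stable manner. The function name, the argument `kT` (standing for $k_B T$ in the same units as the energies), and the treatment of the $T = 0$ limit as a sharp step are assumptions made for the purpose of the example.

```python
import math


def fermi_occupation(energy, mu, kT):
    """Occupation number 1 / (exp((energy - mu) / kT) + 1), cf. (7-90).

    `energy` is the quasi-particle energy, `mu` the chemical potential and
    `kT` the thermal energy, all expressed in the same units.
    """
    if kT <= 0.0:
        # T = 0 limit: a sharp step at the chemical potential
        if energy < mu:
            return 1.0
        if energy > mu:
            return 0.0
        return 0.5
    x = (energy - mu) / kT
    if x >= 0.0:
        # equivalent form exp(-x) / (1 + exp(-x)); avoids overflow for large x
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))
```

At a fixed temperature the occupations of two levels placed symmetrically about $\mu$ add up to unity, which reflects the symmetry between quasi-particles and quasi-holes close to the Fermi surface.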


With this basic formula in place, one can work out the leading
approximations to the entropy, free energy, and the specific heat of the Fermi
liquid at low temperatures. Continuing to follow [112], [5], one obtains

$$ S = \frac{\pi^2}{3}\nu_F k_B^2 T, \tag{7-91a} $$

$$ F - F_0 = -\frac{\pi^2}{6}\nu_F (k_B T)^2, \tag{7-91b} $$

$$ C_V = \frac{\pi^2}{3}\nu_F k_B^2 T. \tag{7-91c} $$

In the above formulae, $\nu_F$ stands for the density of states at the Fermi
surface, related to the effective mass as (refer back to (7-85), where the
relation between the two has been invoked)

$$ \nu_F = \frac{m^* p_F V}{\pi^2\hbar^3}, \tag{7-92} $$

and $F_0$ stands for the free energy at $T = 0$ (i.e., $F_0 = E^{(0)} - \mu N$). Eq. (7-91c)
for the specific heat is made use of for the determination of the effective
mass. Measurements on liquid He-3 give a value $m^* = 3.1m$ at a pressure
of approximately 0.3 atm, from which one obtains, by the use of (7-89a),
$F_1^s \approx 6.3$. This tells us that liquid He-3 can indeed be described as a strongly
interacting system of fermions.
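
The arithmetic behind the value of $F_1^s$ quoted above can be spelled out in a few lines. The sketch below assumes relation (7-89a) in the form $m^*/m = 1 + F_1^s/3$; the function names are illustrative.

```python
def f1s_from_mass_ratio(mass_ratio):
    """Invert m*/m = 1 + F_1^s / 3 to obtain F_1^s from the measured m*/m."""
    return 3.0 * (mass_ratio - 1.0)


def mass_ratio_from_f1s(f1s):
    """Effective mass ratio m*/m for a given F_1^s, assuming (7-89a)."""
    if f1s <= -3.0:
        # (7-89b): a positive effective mass requires F_1^s > -3
        raise ValueError("F_1^s must exceed -3 for a positive effective mass")
    return 1.0 + f1s / 3.0
```

With the measured ratio $m^*/m = 3.1$ for liquid He-3, `f1s_from_mass_ratio(3.1)` reproduces the value of about 6.3 mentioned in the text.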


As for the compressibility $\kappa = -\frac{1}{V}\frac{\partial V}{\partial P}$ (the isothermal and adiabatic
compressibilities turn out to be the same in the leading approximation at
low temperatures), one can make use of standard thermodynamic relations
to express this in the form $\kappa = \frac{1}{n^2}\left(\frac{\partial n}{\partial \mu}\right)$ ($n = \frac{N}{V}$), from which the
compressibility works out to

$$ \kappa = \frac{1}{n^2}\,\frac{m^* p_F}{\pi^2\hbar^3}\,\frac{1}{1 + F_0^s}. \tag{7-93} $$

When compared with the compressibility of an ideal Fermi gas, made up
of fermions of mass $m^*$, and having a particle density $n$, the above
expression differs by a multiplicative factor of $\frac{1}{1+F_0^s}$. From measured values of
compressibility and velocity of sound (see below) one finds that $F_0^s$ has a
value of the order of 10 for liquid He-3 at low pressures, confirming once
again that it is a system made up of strongly interacting fermions.


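The compressibility is related to the velocity of sound ($c$) as

$$ c^2 = \frac{1}{mn\kappa}. \tag{7-94} $$

On making use of the expression (7-93) for the compressibility, together
with $n = \frac{p_F^3}{3\pi^2\hbar^3}$, one obtains

$$ c^2 = \frac{p_F^2}{3mm^*}\left(1 + F_0^s\right) \tag{7-95} $$

(check this out).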
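
The check suggested above can be carried out numerically. The following sketch assumes the relations $n = p_F^3/(3\pi^2\hbar^3)$, $\kappa = \frac{1}{n^2}\frac{m^* p_F}{\pi^2\hbar^3}\frac{1}{1+F_0^s}$, and $c^2 = 1/(mn\kappa)$, and compares the result with the closed form $c^2 = \frac{p_F^2}{3mm^*}(1+F_0^s)$; the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def number_density(p_fermi):
    """Particle density n = p_F^3 / (3 pi^2 hbar^3), both spins included."""
    return p_fermi**3 / (3.0 * math.pi**2 * HBAR**3)


def compressibility(m_star, p_fermi, f0s):
    """kappa = (1/n^2) (m* p_F / (pi^2 hbar^3)) / (1 + F_0^s)."""
    n = number_density(p_fermi)
    dos_per_volume = m_star * p_fermi / (math.pi**2 * HBAR**3)
    return dos_per_volume / (n**2 * (1.0 + f0s))


def sound_speed_sq_from_kappa(mass, m_star, p_fermi, f0s):
    """c^2 = 1 / (m n kappa), with m the bare particle mass."""
    n = number_density(p_fermi)
    return 1.0 / (mass * n * compressibility(m_star, p_fermi, f0s))


def sound_speed_sq_closed_form(mass, m_star, p_fermi, f0s):
    """c^2 = p_F^2 (1 + F_0^s) / (3 m m*)."""
    return p_fermi**2 * (1.0 + f0s) / (3.0 * mass * m_star)
```

The two routes agree to within rounding error for any choice of parameters with $F_0^s > -1$, which confirms the algebra leading from the compressibility to the velocity of sound.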


Sound propagation is a non-equilibrium phenomenon where there is a local 
deviation from equilibrium followed by a process of relaxation, and the two 
alternate in space and time. Physically relevant quantities characterizing 
the response of a system to small deviations from equilibrium can be all 
related to equilibrium expectation values of products of appropriate observable
quantities pertaining to the system, as explained in chapter 8. Sound
propagation in a Fermi liquid is briefly discussed against the background of the
relevant transport equation in sec. 7.5.2.3.


One observes that a necessary condition to be satisfied by $F_0^s$, analogous
to (7-89b), is

$$ F_0^s > -1, \tag{7-96} $$

which, if violated, would lead to an instability of the system since an
imaginary value of the velocity of sound would imply an exponential buildup of
density fluctuations. Once again, this condition is part of a set of stability
conditions (see (7-108)) involving $F_l^s$, $F_l^a$ for $l = 0, 1, 2, \ldots$.


One can also calculate the spin susceptibility of the Fermi liquid, which
measures its static response to a small external magnetic field $H$. The
external field modifies the energy of each spin component ($\sigma = \pm\frac{1}{2}$) in
opposite directions by equal amounts, which causes a displacement of the
two corresponding Fermi surfaces so as to equalize the chemical potential
of the two components. The chemical potential at equilibrium in the
presence of the field differs from that without the field by a term of the
second degree in $H$, which can be ignored. The distribution function in
the presence of the field is then obtained from (7-90) by replacing $\tilde{\epsilon}_p - \mu$
on the right hand side with $\tilde{\epsilon}_p - \mu - g\mu_B\sigma H$, where $\mu_B$ is the Bohr
magneton and $g$ stands for the gyromagnetic ratio. The spin susceptibility
then works out to (see [112]; we assume a gyromagnetic ratio $g = 2$)

$$ \chi = \frac{m^* p_F}{\pi^2\hbar^3}\,\frac{\mu_B^2}{1 + F_0^a}, \tag{7-97} $$

which involves the antisymmetric coefficient $F_0^a$ and implies yet another
necessary condition for stability, namely

$$ F_0^a > -1. \tag{7-98} $$

The violation of this condition leads to instability against long wavelength
(low frequency) fluctuations of the magnetic moment.
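
The static results of this section can be summarized by comparing each quantity with its value for an ideal Fermi gas of bare mass $m$ at the same density. The sketch below assumes that the specific heat scales as $m^*$, the compressibility as $m^*/(1+F_0^s)$, and the spin susceptibility as $m^*/(1+F_0^a)$, as in (7-91c), (7-93) and (7-97); the function name and the dictionary keys are illustrative.

```python
def landau_static_ratios(mass_ratio, f0s, f0a):
    """Ratios of Fermi-liquid quantities to those of an ideal Fermi gas of
    bare mass m at the same density.

    mass_ratio : m*/m
    f0s, f0a   : spin-symmetric and spin-antisymmetric l = 0 Landau parameters
    """
    if f0s <= -1.0 or f0a <= -1.0:
        # stability conditions (7-96) and (7-98)
        raise ValueError("stability requires F_0^s > -1 and F_0^a > -1")
    return {
        "specific_heat": mass_ratio,
        "compressibility": mass_ratio / (1.0 + f0s),
        "spin_susceptibility": mass_ratio / (1.0 + f0a),
    }
```

A large positive $F_0^s$ thus makes the liquid much stiffer than the corresponding ideal gas, while a negative $F_0^a$ enhances the spin susceptibility.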


7.5.2.3 Fermi liquids: non-equilibrium phenomena and the transport equation


The question of instability brings up the issue of non-equilibrium
phenomena in a Fermi liquid under small deviations from equilibrium
configurations. The quantity of basic relevance in the description of such
phenomena is the deviation of the distribution function $n_p(\mathbf{r},t)$ from the
equilibrium distribution where, generally speaking, the perturbed
distribution function, in addition to being time dependent, varies in space as
well (recall that the sub-index $p$ is meant to include the spin index $\sigma$
in the interest of simplicity of presentation; the spin index is mentioned
explicitly as necessary). However, in referring to $n_p(\mathbf{r},t)$ for a quantum
mechanical system one comes up against the requirement of the uncertainty
principle that restricts the simultaneous specification of $\mathbf{p}$ and $\mathbf{r}$.


This restriction, however, is not prohibitive when one considers macroscopic
phenomena on space and time scales large compared to scales
in the atomic domain. More specifically, considering a typical Fourier
component of $\delta n_p = n_p(\mathbf{r},t) - n_p^{(0)}$, i.e., assuming a space-time variation of
the form

$$ n_p(\mathbf{r},t) = n_p^{(0)} + \delta n_p(\mathbf{q},\omega)\,e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)}, \tag{7-99} $$

characterized by a wave vector $\mathbf{q}$ and frequency $\omega$, one cannot consider
arbitrarily large values of $q$ and $\omega$. At sufficiently low temperatures, a
distribution function of the form (7-99) makes sense only if the conditions

$$ \hbar q v_F \ll k_B T, \qquad \hbar\omega \ll k_B T, \tag{7-100} $$

are satisfied. The space-time variation of the distribution function is then
sufficiently smooth corresponding, essentially, to a classical description.
Landau’s theory, however, works under less prohibitive restrictions, of
the form

$$ \hbar q v_F \ll \mu, \qquad \hbar\omega \ll \mu, \tag{7-101} $$

which enables one to work in a semi-classical setting where $\delta n_p(\mathbf{q},\omega)$ now
relates to the Wigner distribution describing the probability amplitude of
having a pair made up of a quasi-particle and a quasi-hole at $\mathbf{p} + \frac{1}{2}\hbar\mathbf{q}$,
$\mathbf{p} - \frac{1}{2}\hbar\mathbf{q}$. With this understanding, we continue to refer to the distribution $\delta n_p$,
so as to get at the basic ideas involved.


Starting with the distribution function $\delta n_p(\mathbf{r},t)$, one sets up the transport
equation describing its rate of change in the same spirit as in the case of
the Boltzmann equation to be taken up in chapter 8. For this, $\delta n_p(\mathbf{r},t)$ is
to be interpreted as the spatial density (i.e., number per unit volume) at
the point $\mathbf{r}$ at time $t$. Moreover, one assumes that the interaction between
quasi-particles (now expressed in terms of space dependent coefficients
$f_{pp'}(\mathbf{r},\mathbf{r}')$) is governed by short range forces between the fermions, in which
case the energy of the system of quasi-particles (which is a functional of
the distribution $\delta n_p(\mathbf{r},t)$) can be expressed in the form (the reference to
time $t$ is left implied)

$$ E = E^{(0)} + \int d^3r\,\delta E(\mathbf{r}), \tag{7-102a} $$

where the local energy density is given by

$$ \delta E(\mathbf{r}) = \sum_p \epsilon_p\,\delta n_p(\mathbf{r}) + \frac{1}{2}\sum_{p,p'} f_{pp'}\,\delta n_p(\mathbf{r})\,\delta n_{p'}(\mathbf{r}), \tag{7-102b} $$

with the interaction coefficients $f_{pp'}$ defined as

$$ f_{pp'} = \int d^3r'\, f_{pp'}(\mathbf{r},\mathbf{r}') \tag{7-102c} $$

(this is independent of $\mathbf{r}$ for a translationally invariant system).


In other words, the $f_{pp'}$ appear as the strengths of interaction between
quasi-particles whose density is defined locally, and the expression for the
energy density can be interpreted as one describing a local equilibrium of
‘dressed’ particles. The local excitation energy of a quasi-particle appears
as

$$ \tilde{\epsilon}_p(\mathbf{r}) = \epsilon_p + \sum_{p'} f_{pp'}\,\delta n_{p'}(\mathbf{r}). \tag{7-103} $$

Compared to (7-82b), the local energy now depends on the position $\mathbf{r}$. The
momentum space gradient $\nabla_p\tilde{\epsilon}_p(\mathbf{r})$ continues to give the velocity ($\mathbf{v}_p(\mathbf{r})$) of
the quasi-particle (now defined locally), while the position space gradient
$-\nabla_r\tilde{\epsilon}_p(\mathbf{r})$, which is a new feature arising because of spatial inhomogeneity,
describes a force that tends to push the quasi-particle towards regions of
lower energy.


With this background one can set up the transport equation pertaining
to the excited quasi-particles, leaving aside the particles making up the
ground state. The advantage of this approach is that one can, to start
with, ignore the interaction between the excited quasi-particles since
these are few in number at sufficiently low temperatures, and consider
only the interaction between the quasi-particles and those making up the
ground state. Writing $n_p(\mathbf{r},t) = n_p^{(0)} + \delta n_p(\mathbf{r},t)$ (the time dependence is now
made explicit), the transport equation appears as

$$ \frac{\partial\,\delta n_p(\mathbf{r},t)}{\partial t} + \nabla_r\,\delta n_p(\mathbf{r},t)\cdot\mathbf{v}_p(\mathbf{r}) - \nabla_p n_p^{(0)}\cdot\Big[\sum_{p'} f_{pp'}\nabla_r\,\delta n_{p'}(\mathbf{r},t)\Big] = 0. \tag{7-104} $$


This compares with the Boltzmann transport equation to be taken up in
section 8.3, where only the ‘free streaming’ terms pertaining to the
excited quasi-particles are included, with the special feature that only the
diffusion force on a quasi-particle due to its interaction with the ground
state particles is included, and not the force exerted by the other
quasi-particles. In other words, the quasi-particles are assumed to be
independent of one another, interacting only with the ground state particles.
This latter interaction has no net effect in the equilibrium state which is
spatially homogeneous and appears only due to the inhomogeneity when
the system departs from equilibrium. The effective Hamiltonian
determining the free streaming terms is given by the energy expression ($\tilde{\epsilon}_p(\mathbf{r})$)
in (7-103).


The linear transport equation (7-104) assumes a relatively simple form
when one makes use of the distribution $\delta\bar{n}_p$ defined as

$$ \delta\bar{n}_p(\mathbf{r}) = n_p - \bar{n}_p^{(0)} = n_p - n^{(0)}(\tilde{\epsilon}_p - \mu), \tag{7-105} $$

which describes the deviation from the local equilibrium characterized by
the local quasi-particle energy $\tilde{\epsilon}_p(\mathbf{r})$. In this expression, $n^{(0)}(\epsilon)$ stands for
the ideal gas Fermi distribution at $T = 0$. In terms of $\delta\bar{n}_p(\mathbf{r})$, one has

$$ \frac{\partial\,\delta n_p(\mathbf{r},t)}{\partial t} + \nabla_r\,\delta\bar{n}_p(\mathbf{r},t)\cdot\mathbf{v}_p(\mathbf{r}) = 0. \tag{7-106} $$

Even in this simple form where the interaction among the quasi-particles
and the action of external forces have been ignored, the transport
equation is more complex than the corresponding free-streaming transport
equation for a dilute gas (refer to chapter 8) since it is actually an integral
equation in virtue of (7-103).


The exclusion or inclusion of quasi-particle interactions in the transport
equation is determined by the collision frequency $\nu_c$ or, equivalently, by
the collision time $\tau_c$. For non-equilibrium processes occurring on a
characteristic time scale $\tau \ll \tau_c$ (i.e., with characteristic frequency $\omega \gg \nu_c$),
one has to consider the collision-less regime described above. On the
other hand, for sufficiently slow processes ($\omega \ll \nu_c$, $\tau \gg \tau_c$), one has
to include a term analogous to the collision integral, to be considered in
connection with the Boltzmann equation in chapter 8, in the transport
equation.
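
The distinction between the two regimes can be made concrete with a small helper that compares the frequency of the process with the collision frequency. This is only a sketch: the function name and, in particular, the factor `separation` used to decide when one frequency is "much larger" than the other are hypothetical choices made for the example and are not part of the theory.

```python
def transport_regime(omega, nu_c, separation=10.0):
    """Classify a non-equilibrium process of frequency `omega` relative to
    the quasi-particle collision frequency `nu_c`.

    `separation` is an arbitrary illustrative threshold for 'much larger'.
    """
    if omega <= 0.0 or nu_c <= 0.0:
        raise ValueError("frequencies must be positive")
    ratio = omega / nu_c
    if ratio >= separation:
        return "collisionless"   # omega >> nu_c: collision term may be dropped
    if ratio <= 1.0 / separation:
        return "hydrodynamic"    # omega << nu_c: collision term is essential
    return "crossover"           # neither limit applies
```

Processes classified as `"crossover"` fall in the intermediate range where neither limiting form of the transport equation is adequate.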


Finally, the complete transport equation is to incorporate the effect of
external force on the distribution of the quasi-particles. However, given
the external force on a bare particle it is not easy to get at the force on a
quasi-particle. Assuming that the force on a quasi-particle in state $p$ is
$\mathbf{F}_p$, the complete transport equation appears as ([112])

$$ \frac{\partial\,\delta n_p(\mathbf{r},t)}{\partial t} + \nabla_r\,\delta\bar{n}_p(\mathbf{r},t)\cdot\mathbf{v}_p(\mathbf{r}) + \mathbf{F}_p\cdot\nabla_p n_p^{(0)} = \mathcal{I}(n_p), \tag{7-107} $$

where the third term on the left hand side is an approximation to the drift
in momentum space due to the external force, and the right hand side
stands for the rate of change of $n_p$ due to collisions among quasi-particles.
One or more terms of this equation may be ignored depending on the type
of non-equilibrium process under consideration.


For instance, in the collision-less regime, where the time scale
characterizing the non-equilibrium process is small compared to the collision
time, and in the absence of external fields, one can make use of the
simpler form (7-106) when, in the main, two distinct types of processes can
be identified, namely, those in which the quasi-particle distribution
remains localized over some small region of the Fermi surface, and those
in which the excitation is distributed around the entire Fermi surface.
The latter type appears as collective modes and can be described, at
sufficiently low temperatures, as an oscillation of the Fermi surface itself.
The collective modes include, in particular, the propagation of zero sound
predicted by Landau.


A collective mode can be conveniently described in terms of the normal
displacement of the Fermi surface at point $\mathbf{p}$ [112] where the spin index $\sigma$
is to be considered separately. If $\mathbf{q}$ denotes the wave vector characterizing
the mode, and $\theta$, $\phi$ are polar angles with the direction of $\mathbf{q}$ chosen as the
polar axis, then one can express the normal displacement in terms of
$\theta$, $\phi$, $\sigma$. The modes in which the spins oscillate in phase with the normal
displacement are decoupled from those in which the spins are out of
phase, and are referred to as spin symmetric and spin anti-symmetric
ones. As for the $\theta$, $\phi$ dependence of each of these, one can expand the
normal displacement in terms of the spherical harmonics $Y_{lm}(\theta, \phi)$, when
it turns out from the transport equation that modes with distinct values
of $m$ are decoupled from one another. Thus, finally, collective modes
can be classified as spin symmetric and spin anti-symmetric ones with
$m = 0$ (‘longitudinal’), $m = 1$ (‘transverse’), and so on. In this classification
scheme, zero sound appears as the longitudinal spin symmetric mode.


A collective mode gets damped under certain conditions when energy is
transferred from the collective oscillations to individual quasi-particles.
In particular, there arise resonances in which the velocity in the
direction of $\mathbf{q}$ ($\frac{\mathbf{v}_p\cdot\mathbf{q}}{q}$) of a quasi-particle coincides with the phase velocity ($\frac{\omega}{q}$)
of the mode (recall that a collective mode consists of coherent
oscillations of all quasi-particles over the Fermi surface). Away from the
resonance condition, a net energy transfer takes place from the collective
mode to individual quasi-particles when $\frac{\mathbf{v}_p\cdot\mathbf{q}}{q}$ happens to be slightly less
than the phase velocity. Such damping of a collective mode is referred
to as Landau damping by analogy with the corresponding damping mechanism
in classical plasma oscillations (in contrast, the collective modes under
consideration are quantum phenomena where an essential role is played
by the fermionic nature of the particles). Another mechanism of damping
is that by collision among the quasi-particles which operates regardless
of the relative magnitude of $\frac{\mathbf{v}_p\cdot\mathbf{q}}{q}$ and $\frac{\omega}{q}$. However, the Landau damping
mechanism predominates when the quasi-particle velocity is less than
the phase velocity.


In contrast to the process of damping, there may occur a growth of a
collective mode whereby it gets destabilized. In order that the Fermi liquid
remains stable against the growth of collective modes, a set of conditions
(referred to as the Pomeranchuk conditions) are to be necessarily satisfied
by the interaction coefficients $F_l^s$, $F_l^a$ (refer to (7-85)) for $l = 0, 1, 2, \ldots$,
which read

$$ F_l^s > -(2l+1), \qquad F_l^a > -(2l+1) \qquad (l = 0, 1, 2, \ldots). \tag{7-108} $$

As mentioned earlier, the conditions (7-89b), (7-96), (7-98) appear as
particular instances of this more general set of conditions necessary for the
stability of the Fermi liquid.
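
The Pomeranchuk conditions lend themselves to a simple programmatic check. The sketch below assumes the conditions in the form $F_l^s > -(2l+1)$ and $F_l^a > -(2l+1)$, and takes the Landau parameters as two lists indexed by $l$; the function name and the format of the returned records are illustrative.

```python
def pomeranchuk_violations(f_sym, f_anti):
    """Return the Landau parameters that violate F_l > -(2l + 1).

    f_sym[l] and f_anti[l] hold the spin-symmetric and spin-antisymmetric
    coefficients for l = 0, 1, 2, ...  Each violation is reported as a tuple
    (channel, l, value, bound) with channel "s" or "a".
    """
    violations = []
    for channel, coefficients in (("s", f_sym), ("a", f_anti)):
        for l, value in enumerate(coefficients):
            bound = -(2.0 * l + 1.0)
            if not value > bound:
                violations.append((channel, l, value, bound))
    return violations
```

An empty list means that all the supplied coefficients are compatible with stability; for instance, the values $F_0^s \approx 10$ and $F_1^s \approx 6.3$ mentioned earlier for liquid He-3 produce no violations.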
Finally, we come to the process of collision among the quasi-particles
that makes necessary the inclusion of the collision integral $\mathcal{I}(n_p)$ in the
transport equation (7-107). This is analogous to the collision among
particles of a dilute gas, to be considered in the context of the Boltzmann
equation in chapter 8, and assumes relevance for $\omega \ll \nu_c$, i.e., for
non-equilibrium phenomena occurring on a time scale large compared to the
collision time among quasi-particles. This corresponds to the so-called
hydrodynamic regime, where the transport equation can be made use of
to set up balance equations for locally defined densities of number,
momentum, energy, and spin of the quasi-particles. With reference to
the balance equations, one can then arrive at microscopic expressions for
such transport coefficients as the viscosity, thermal conductivity, and
the spin diffusion coefficient worked out, among others, by Abrikosov
and Khalatnikov ([112]). The setting up of the collision integral requires
a knowledge of the two-body scattering amplitude for ‘incoming’ and
‘outgoing’ quasi-particles with specified momenta and spin, which involves
non-trivial quantum mechanical considerations. The derivation of the
transport coefficients and the interpretation of the results, though
analogous to the approach followed in the kinetic theory of gases (refer to
chapter 8), is outside the scope of this book.


In this context, the propagation of sound (referred to as first sound so as
to distinguish it from zero sound; the two, however, arise in mutually
exclusive circumstances) is a process of interest in the collision-dominated
regime defined by the inequality $\omega \ll \nu_c$ (i.e., the frequency characterizing
the sound is to be much less than the collision frequency $\nu_c \sim \frac{1}{\tau_c}$),
where a local build-up of the density is quickly compensated by collisions
and a wave, conforming to the description of a hydrodynamic mode, is set
up. However, collisions play a dual role in the process: that of providing
a restoring mechanism and that of causing dissipation. The former
is responsible for the setting up of the wave, much like the oscillations
characterizing the collective modes, while the latter results in damping.
Indeed, looked at from the point of view of the collective modes and the
expansion in terms of the spherical harmonics mentioned above, one finds
that the $l = 0$ and $l = 1$ modes are left intact by collisional damping while
modes with higher $l$ values are effectively damped.


Among the values of $m$ compatible with $l = 0, 1$, the longitudinal mode
$m = 0$ describes density oscillations along with, possibly, a uniform
translation of the fluid, and corresponds to sound waves. On truncating the
Fourier-transformed transport equation for $l \geq 2$ with $\omega \ll \nu_c$, one obtains
a pair of coupled equations (recall that modes with different values of $l$ are
not decoupled) from which one obtains the velocity of first sound given
by the formula (7-95). In contrast, in the collision-less regime ($\omega \gg \nu_c$),
one has to retain all possible values of $l$, implying an infinite number of
independent modes with distinct values of $m$, out of which the longitudinal
mode ($m = 0$, compatible with all values of $l$) corresponds to the zero
sound.


Though the collisions play predominantly the role of providing a restoring
mechanism in the density oscillations in the case of first sound, they
do cause viscous damping to a small extent, by the process of momentum
diffusion. Thus, in summary, we have the damped first sound for $\omega \ll \nu_c$
and the damped zero sound for $\omega \gg \nu_c$. A complete theory of damping
over a sufficiently large range of frequencies (starting from $\omega \ll \nu_c$ and
extending up to $\omega \gg \nu_c$) including the transition between the
collision-dominated and the collision-less regimes is outside the scope of this
introductory exposition.


7.6 Superconductivity 


7.6.1 Superconductivity: introduction 


Superconductivity is the phenomenon where a material is found to have
zero electrical resistance (infinite conductivity) below some characteristic
temperature ($T_c$, specific to the material under consideration), and to
expel magnetic field from its bulk ($B = 0$, Meissner effect). While
superconducting materials come in numerous varieties with various different
characteristic features (thus, there are superconductors of type I and
type II, and high-$T_c$ superconductors), the above two can be looked upon
as basic ones.

In the present section we look at a number of basic ideas in the BCS
theory of superconductivity which has turned out to be of fundamental
relevance in the entire vast field of theoretical and experimental work on
superconductivity.


In this, we follow, in the main, the presentation in [136], one that is lucid
and detailed, and well suited to the introductory exposition that the present
chapter aims at, though only a small fraction of the rich
content of the book can possibly be accessed by such an exposition. In
addition, [120] and [79] provide a survey of the subject with lucidly explained
physical ideas and with a minimum of technical complexities. In the present
book, we gloss over a number of aspects relating to the theory of
superconductivity; a more detailed and thoroughgoing treatment is to be found in [44].
I also recommend [68], chapter 10.


Fundamentally speaking, superconductivity is the result of the interac- 
tion among the assembly of conduction electrons in a metal (or, more 
generally, a metallic complex or compound), combined with the fermionic 
nature of the electrons. The Hamiltonian for the assembly of electrons will 
be written as the sum of a part describing independent electrons in the 


metal, and another representing the effective interaction among these. 


As mentioned above, this will involve idealizations and approximations, i.e., 
abstractions from actual systems, not all justifiable rigorously. Still, the 
approach outlined in the following paragraphs does provide an explanation 


of the principal features of superconductivity. 


Let us take up first the independent-electron model of metals. A number 
of thermodynamic properties of the assembly of conduction electrons in 
a metal can be explained under the simple assumption that these are 
non-interacting particles confined within a finite volume V (ultimately 
assumed to be infinitely large), obeying the FD statistics and, at
sufficiently low temperatures, forming a degenerate ideal FD gas (refer to
section 3.3.6.2). However, this simple model fails to account for a number
of important features characterizing the behavior of metallic substances,
many of which can be accounted for by making the additional assumption
of a weak periodic potential experienced by the electrons. This potential arises
as an effective residual influence resulting from the Coulomb interaction
between the electrons (which, strictly speaking, is to be calculated in a
self-consistent manner) as also from the interaction between the electrons
and the ions in the crystalline lattice of the metal. However, the electrons
can still be assumed, in an approximate sense, to experience the periodic
potential independently of one another (see section 3.4.3; also, see the
note below).


Digression: The basic idea underlying the nearly free electron model of a normal metal.


We consider, for the sake of simplicity, a monovalent metal at a very low 
temperature (we omit to address the question as to what the phrase ‘very 
low’ means in concrete terms) where one can, to start with, ignore the ther- 
mal vibrations of the ions making up the crystalline lattice of the metal, 
assuming instead that the ions form a fixed structure and their mutual 
Coulomb interaction contributes a constant term to the Hamiltonian of the 
system. The ionic vibrations, however, play an important role in the expla- 
nation of superconductivity. The effect of these vibrations can be accounted 
for in terms of phonons, i.e., the quanta of the vibrational modes where, in 
the first approximation, the phonons and the electrons can be assumed to 


constitute two independent systems. 


Discounting for the present the interaction between the phonons and the
electrons (see sec. 7.6.3 below where we consider the phonon-mediated
interaction between the electrons) one can take into account the effect, on any
single electron, of the Coulomb interaction between the electrons and that
between the fixed ions and the electrons, by invoking Bloch’s theorem. The
latter states that, in a periodic potential (as in the crystalline environment
in a metal) a typical energy eigenfunction of the electron can be expressed
in the form

$$ \psi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\, u_{\mathbf{k}}(\mathbf{r}), \tag{7-109} $$

where $\mathbf{k}$ is a wave vector specifying the momentum (or, more precisely, a
quasi-momentum, since $\mathbf{k}$ is defined only up to a vector of the reciprocal lattice),
and where the eigenfunction represents a modulated plane wave, the
modulation being described by the function $u_{\mathbf{k}}(\mathbf{r})$, which shares the periodicity
of the lattice. The probability distribution corresponding to the above
eigenfunction being $|u_{\mathbf{k}}(\mathbf{r})|^2$, the theorem implies that the potential due to the
electron under consideration shares the same periodicity. We now imagine,
for the single electron under consideration, the effect of $N$ positive
ions (forming a fixed crystalline structure) and of the $N - 1$ remaining
electrons, where each of these $N - 1$ electrons generates
a periodic potential of its own, as a result of which the potential due
to the $N$ ions is screened by the Coulomb potential of the $N - 1$
electrons. Since the screening occurs identically for all the $N$
ions, the single electron under consideration feels only an effective
Coulomb potential due to a periodic array of $N$ screened
charges, each having a magnitude of $1 - \frac{N-1}{N} = \frac{1}{N}$ times the ionic charge.
In other words, the effect felt by any single electron is that of a very weak
periodic potential, as stated above.
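
The content of Bloch’s theorem can be illustrated in one dimension. The sketch below assumes the Bloch form $\psi_k(x) = e^{ikx}u_k(x)$ with a lattice constant $a$ and a simple, purely hypothetical choice of the periodic modulation, $u(x) = 1 + \lambda\cos(2\pi x/a)$; it is not tied to any particular crystal potential.

```python
import cmath
import math


def bloch_wave(x, k, a, modulation=0.3):
    """One-dimensional Bloch function psi_k(x) = exp(i k x) u(x).

    u(x) = 1 + modulation * cos(2 pi x / a) is a hypothetical modulation
    having the periodicity a of the lattice.
    """
    u = 1.0 + modulation * math.cos(2.0 * math.pi * x / a)
    return cmath.exp(1j * k * x) * u
```

Translating the argument by one lattice constant multiplies the function by the phase factor $e^{ika}$, so that the probability density $|\psi_k(x)|^2 = |u(x)|^2$ has the periodicity of the lattice, as stated above.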


The simplified and intuitive argument given above can be made rigorous 
by self-consistent many-body calculations, but the result remains that the 
Coulomb interaction between the electrons and that between the electrons 


and the ions have the joint effect of producing a weak periodic potential
in which the individual electrons may now be considered to move
independently of one another (since the Coulomb interaction between them has
already been taken into account in a self-consistent manner).


As mentioned above, the theoretical framework in which the electrons in 
a metal are considered to constitute a system of independent fermions 
in a weak periodic potential is quite successful in explaining a number 
of electrical and magnetic properties of normal metals (i.e., ones that are 
not in the superconducting state). In this theory, the electrons in the 
metal constitute an ideal gas obeying the FD statistics, where the latter 
describes the distribution of the electrons among discrete energy levels 
forming a set of energy bands. At sufficiently low temperatures, only 
a single band (the so-called conduction band) is relevant in describing 
the properties of the metal. Each energy level can accommodate two
electronic states (the ‘Bloch states’) with opposite spins, and each state
is characterized by a wave vector $\mathbf{k}$ and a spin index $\sigma$ that
can take two values denoted symbolically as ‘up’ ($\uparrow$) and ‘down’
($\downarrow$); in the abstract Dirac notation, such a state is denoted by
the symbol $|\mathbf{k}\sigma\rangle$. At $T = 0$, all states up to the Fermi
level $\epsilon_F$ are occupied while the states with energies
$\epsilon > \epsilon_F$ remain vacant. At higher temperatures some of the
electrons below $\epsilon_F$ are thermally excited to higher energy levels in
accordance with the Fermi distribution. The latter is essentially the same
as described in sec. 3.3.6.2, but with the electronic mass $m$ replaced
with an effective mass $m_{\mathrm{eff}}$ that represents the effect of the
periodic potential on the electrons (refer to section 3.4.3). Generally
speaking, the effective mass is represented by a tensor, which reduces to a
scalar in the case of cubic symmetry. In the following, we will assume that
the effective mass is a scalar and omit the sub-index ‘eff’ for the sake of
simplicity. Additionally, one has to remember that
$\mathbf{p} = \hbar\mathbf{k}$ is a quasi-momentum that differs from the
momentum considered in the case of the ideal gas in sharing the periodicity
of the reciprocal lattice.


However, the self-consistent independent-electron model falls short of ex- 
plaining the phenomenon of superconductivity, which has been found to 
be accounted for by a specific type of attractive interaction between the 
electrons, to be outlined below, not included in the independent-electron 
model. This interaction is again of a many-body nature, i.e., one that 
arises only in virtue of the presence of other electrons and ions forming 
the environment in which the interaction arises. In the rest of this section 
we will first look at a number of phenomenological features of
superconductivity (sec. 7.6.2), many of which are explained by the
phenomenological London theory (sec. 7.6.2.3), and then attempt a
qualitative understanding of how an effectively attractive
electron-electron interaction can arise. This will be followed by the
outline of an analysis which implies the formation of a bound state
(referred to as a Cooper pair) between a pair of electrons near the Fermi
surface of the strongly degenerate Fermi gas.


1. The Fermi surface is a surface in the space of wave vectors $\mathbf{k}$
(or, equivalently, in the space of quasi-momenta
$\mathbf{p} = \hbar\mathbf{k}$), where the possible values of $\mathbf{k}$ all
lie within a region inside the Brillouin zone in the reciprocal lattice,
with the Fermi surface constituting the boundary of the region. The energy
corresponding to an electronic state with its wave vector lying on the
Fermi surface is $\epsilon_F$, the Fermi energy.


2. The independent electron model of a metal essentially describes the 
system of electrons as a Fermi liquid considered in sec. 7.5.2. However, 
when the phonon-mediated electron-electron interaction is taken into 
account (see below), the excitations at T > 0 differ in nature from those 
of a Fermi liquid since, instead of the Landau type quasi-particles, the 


excitations now appear in the form of Cooper pairs. 


Finally, we will see how the formation of Cooper pairs leads to the pos- 
sibility of the condensation of a macroscopic number of electrons of the 
Fermi gas into the so-called BCS ground state at sufficiently low temper- 
atures because of a finite energy gap separating this state from the higher 
excited states. This requires that we look at the excitation spectrum of 
the system of electrons, which will also determine the temperature de- 
pendence of the energy gap and the basic thermodynamic functions and 
finally, the superconducting transition temperature. All this will consti- 
tute the skeleton of the BCS theory, which explains a number of macro- 
scopic properties of superconductors. We will then wind up by outlining 
the Landau-Ginzburg approach, a phenomenological theory that explains 
a large body of observations on superconductors and, at the same time, 


has intimate ties with the microscopic BCS theory. 
The brief (and sketchy!) analysis we present below is relevant to the
so-called classical or ‘conventional’ superconductors, and not to the
high-$T_c$ superconductors that constitute a distinct type.


7.6.2 Superconductivity: phenomenology


7.6.2.1 Zero resistance


The phenomenon of superconductivity involves a phase transition at a
certain critical temperature $T_c$ (depending on the material under
consideration) whereby the electrical resistance vanishes as the
temperature is lowered below $T_c$. The explanation of this phenomenon is
provided by the BCS theory, as outlined in sec. 7.6.5.4. A large number of
metals and alloys are known to exhibit the superconducting transition at
sufficiently low temperatures, and numerous metallic compounds and
complexes behave likewise. Fig. 7-8 depicts schematically the loss of
resistance of a pure metal (solid curve) below its critical temperature
where the variation around $T_c$ is seen to be remarkably sharp. The dotted
curve shows the effect of a small impurity concentration, due to which the
transition becomes less sharp, while the transition temperature is not
affected appreciably.


However, the loss of resistance notwithstanding, a superconductor can 
carry a current injected into it, where the current can flow in a closed 
path for an indefinite time. A current can also be generated in a super- 
conducting ring by changing the magnetic flux linked with it produced by 
an external magnetic field. In this case, the flux produced by the induced 
current keeps the total flux linked with the closed circuit at a constant 


value. 


It is of importance to note that the resistivity of a superconductor below
its critical temperature is zero only for direct currents flowing through
it, for which there occurs no voltage drop or energy dissipation. The
situation is different in the case of a changing current, and the AC
resistivity is found to be non-zero. An intuitive explanation of this
phenomenon is obtained by imagining the conduction electrons to be divided
into two types (the ‘two-fluid model’): one being the ‘super-electrons’ and
the other the ‘normal’ electrons, of which the former are directly
responsible for superconductivity (i.e., zero resistivity) while the latter
are involved in normal conduction. When a direct current of constant
magnitude flows through the superconductor, no voltage drop is produced
(since otherwise the resulting electric field would cause the current to
grow unboundedly; in an actual electrical circuit, the current is limited
by the source resistance) while, in the case of an AC voltage, the
super-electrons carry an inductive current (due to their inherent inertia)
with a phase lag and, at the same time, the normal electrons carry a
resistive current (in addition to a possible inductive current caused by
the inductance of the current-carrying circuit), attended by energy
dissipation. Typically, however, only a very small fraction of the
alternating current is carried by the normal electrons, and the resulting
dissipation is also negligible for most purposes.


Figure 7-8: Depicting the sharp variation of resistivity of a pure metal at its critical
temperature, below which it becomes superconducting; the dotted curve depicts the
effect of a small impurity concentration (schematic) due to which the transition is seen
to be less sharp, while the transition temperature is not affected appreciably; based
on [120], figure 1.3. (Vertical axis: resistivity as a fraction of normal resistivity;
horizontal axis: temperature T.)


7.6.2.2 Perfect diamagnetism: the Meissner-Ochsenfeld effect 


A superconductor is not simply a perfect conductor: it has striking
magnetic behavior sharply distinct from that of a perfect conductor. As
indicated in sec. 7.6.2.1, the magnetic flux linked with any closed path
within a perfect conductor remains unchanged even when external conditions
are changed (e.g., removal of applied field, or cooling), which implies
that inside the conductor, the rate of change of the flux density is always
zero:

$$\text{perfect conductor}: \quad \dot{\mathbf{B}} = 0. \qquad \text{(7-110a)}$$

In a superconductor, on the other hand, a stronger constraint holds,
namely, the flux density is identically zero within its bulk:

$$\text{superconductor}: \quad \mathbf{B} = 0. \qquad \text{(7-110b)}$$


In other words, a superconductor is a perfect diamagnet. This crucial 
feature of superconductivity was discovered by Meissner and Ochsenfeld 


and is commonly referred to as the Meissner effect. 


If a conductor were to lose its resistance on cooling to a low temperature 
without the Meissner effect making its appearance, then its state at such 
a low temperature in the presence of an external field would not be one of 
thermodynamic equilibrium since it would depend on the history as to how 
the temperature and the field strength are brought to their given values 
(refer to [68], chapter 10). The Meissner effect, on the other hand,
ensures that the state of a superconductor is a unique one (with zero flux
inside the material), regardless of the history.


The Meissner effect is observed only when a relatively weak magnetic field
is applied to a superconductor, in which case the field does not penetrate
to the bulk of the material. If, on the other hand, one applies a
sufficiently strong magnetic field $H > H_c$, where $H_c$ is referred to as
the critical field strength, then the material regains the features of a
normal conductor. The process of transition from the superconducting to the
normal state by the application of a magnetic field of sufficient strength
is a reversible one, and the superconducting state is regained when the
field strength is reduced, there being no dissipation of energy if the
field strength is changed quasi-statically.


In this context, one distinguishes between type I and type II
superconductors (refer back to sec. 7.6.1). A type I superconductor is
characterized by a single (temperature-dependent) critical field strength
$H_c$, above which the material in question becomes a normal conductor, as
mentioned above. A type II superconductor, on the other hand, is
characterized by two values of the critical field strength, viz., a lower
critical field $H_{c1}$ and an upper critical field $H_{c2}$. For applied
field $H < H_{c1}$, the material exhibits features of the Meissner phase
with complete expulsion of the magnetic field from its bulk, while for
$H_{c1} < H < H_{c2}$, one observes a mixed phase (also referred to as the
‘Shubnikov phase’) in which there appear alternating regions in the bulk of
the material with features of the superconductor (zero flux density) and
normal conductor respectively. With $H > H_{c2}$, the material becomes a
normal conductor. Pure metals are mostly found to belong to type I, while
metallic alloys are generally found to belong to type II. In the mixed
phase of a type II superconductor, magnetic field can penetrate the bulk of
the material in the form of flux tubes in regions separated from one
another by those of zero magnetic field.


Figure 7-9: (A) Phase diagram of a type I superconductor (schematic), depicted with
temperature T and applied field strength H as the relevant parameters; the shaded
region corresponds to the superconducting phase, and the region above the border curve
(corresponding to the variation of $H_c$ as a function of T) to the normal phase; (B) phase
diagram of type II superconductor (schematic), showing the existence of three phases,
namely, the Meissner phase (no magnetic field in the interior of the material), the mixed
phase (alternating regions of zero and non-zero field strength, the latter associated with
flux tubes), and the normal phase.


Fig. 7-9(A) gives the phase diagram of a type I superconductor in the
T-H plane, where the phase depends on the temperature ($T$) and the
magnetic field strength ($H$) that may be applied to a sample, and where
the shaded region corresponds to the superconducting phase while the
normal conducting phase corresponds to relatively large values of $T$ and
$H$. The boundary between the two regions gives the variation of $H_c$ with
$T$, an approximate formula for which is

$$H_c(T) \approx H_c(0)\left[1 - \left(\frac{T}{T_c}\right)^2\right]. \qquad \text{(7-111)}$$
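The boundary curve of the type I phase diagram can be explored numerically. The sketch below assumes the commonly quoted parabolic approximation $H_c(T) \approx H_c(0)[1 - (T/T_c)^2]$ for the critical field; the function names and the sample values of $T_c$ and $H_c(0)$ (roughly of the order of those quoted for lead) are illustrative assumptions, not data from the text.

```python
def critical_field(T: float, Tc: float, Hc0: float) -> float:
    """Parabolic approximation to H_c(T); returns zero at and above Tc."""
    if T < 0 or Tc <= 0:
        raise ValueError("temperatures must satisfy T >= 0 and Tc > 0")
    if T >= Tc:
        return 0.0
    return Hc0 * (1.0 - (T / Tc) ** 2)


def is_superconducting(T: float, H: float, Tc: float, Hc0: float) -> bool:
    """True if the point (T, H) lies below the boundary curve H_c(T)."""
    return H < critical_field(T, Tc, Hc0)


TC_EXAMPLE = 7.2     # K, illustrative value
HC0_EXAMPLE = 6.4e4  # A/m, illustrative value

hc_half = critical_field(0.5 * TC_EXAMPLE, TC_EXAMPLE, HC0_EXAMPLE)
below = is_superconducting(3.6, 4.0e4, TC_EXAMPLE, HC0_EXAMPLE)
above = is_superconducting(3.6, 5.0e4, TC_EXAMPLE, HC0_EXAMPLE)
```

At half the critical temperature the approximate critical field is three quarters of its zero-temperature value, so the first sample point lies in the superconducting region and the second in the normal region.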


For comparison, the phase diagram of a type II superconductor is de- 


picted in fig. 7-9(B). 


It may be mentioned that the phase transition from the normal to the
superconducting phase at $T = T_c$ in the absence of a magnetic field is one
of the second order, associated with a finite discontinuity in the specific
heat (refer to fig. 7-15 in sec. 7.6.5.3), the latent heat of the
transition being zero. In the presence of an applied magnetic field,
however, the transition is one of the first order, associated with a
non-zero latent heat, when the applied field is made to vary across the
critical value $H_c(T)$.


In order that the magnetic field strength inside the bulk of a
superconductor may be zero regardless of the field strength in the region external
to the material, screening currents must flow on its surface. However, 
these screening currents are not confined to a surface sheet of zero thick- 
ness and, instead, flow through a very thin surface layer. Accordingly, 
the magnetic field also penetrates into a thin surface layer while the bulk 


of the superconductor continues to be a region of zero magnetic field. 


In a type II superconductor, on the other hand, flux tubes get implanted
in the bulk of the material for applied field ($H$) of intermediate strength
($H_{c1} < H < H_{c2}$). The phase diagram in this case is shown in
fig. 7-9(B).


We conclude this section by pointing out that the critical field strength
$H_c$ (where the temperature dependence is left implied; at times, $H_c$ is
meant to denote the critical field at $T = 0$; the default type considered
here is type I) gives us a means to obtain the difference in the free
energies of the normal and the superconducting phases of a material, both
referred to at the same temperature and pressure and at zero strength of
the applied field. Since the Meissner effect is reversible, this difference
is accounted for by the magnetic field energy of the normal phase at
applied field strength $H_c$, since the magnetic energy of the
superconducting phase is zero. In other words, the difference between the
two free energies (per unit volume, expressed in SI units) is simply

$$F_{\mathrm{normal}} - F_{\mathrm{superconducting}} = \frac{1}{2}\mu_0 H_c^2(T) \qquad \text{(7-112)}$$

(where the relative permeability of the material is assumed to be unity, as
is commonly the case; we will consider here the superconductivity of
non-magnetic materials), which can be looked upon as a measure of the
condensation energy from the point of view of the microscopic BCS theory
(see sec. 7.6.5 below), related to the energy gap resulting from the
formation of Cooper pairs. Indeed, formula (7-112) is of great value in
checking the validity of expressions for thermodynamic functions of a
superconductor as derived from the BCS theory.
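To get a feel for the magnitude of the free energy difference, one can evaluate $\frac{1}{2}\mu_0 H_c^2$ for a representative critical field. The following sketch uses an assumed value of $H_c$ (chosen so that $\mu_0 H_c$ is about 0.01 T); the function name and the sample value are illustrative only.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in H/m (classical value)


def condensation_energy_density(Hc: float) -> float:
    """Free energy difference per unit volume, in J/m^3, for H_c in A/m."""
    return 0.5 * MU_0 * Hc ** 2


HC_EXAMPLE = 8.0e3  # A/m, illustrative value (mu_0 * H_c is about 0.01 T)

delta_f = condensation_energy_density(HC_EXAMPLE)
delta_f_doubled = condensation_energy_density(2.0 * HC_EXAMPLE)
```

For this assumed field the difference comes out at a few tens of joules per cubic metre, and it grows fourfold when the critical field is doubled, reflecting the quadratic dependence on $H_c$.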


The critical field strength featuring in (7-112) is at times referred to as 
the thermodynamic critical field, and is the same as the critical field in 
the case of a type I superconductor. For a type II superconductor, on
the other hand, the thermodynamic critical field lies in between the lower
and upper critical fields $H_{c1}$, $H_{c2}$.


7.6.2.3 The London theory 


In the electrodynamics of material media, Maxwell’s equations are
supplemented with constitutive relations that apply to some particular
material under study, thereby constraining the scope of the Maxwell
equations. In the same spirit, F. London and H. London set up a pair of
phenomenological equations that constrained the scope of the Maxwell
equations (while being consistent with them) to the particular case of
superconductors. These equations read ([68], section 10.2)

$$\mathbf{E} = \Lambda_L \dot{\mathbf{j}}_s, \qquad \mathbf{B} = -\Lambda_L\, \mathrm{curl}\, \mathbf{j}_s, \qquad \text{(7-113a)}$$

where a dot denotes the partial derivative over time, and $\Lambda_L$ is a
phenomenological constant, given by

$$\Lambda_L = \frac{m}{n_s e^2}. \qquad \text{(7-113b)}$$

In the expression for $\Lambda_L$, $e$ and $m$ stand for the electronic
charge and mass, while $n_s$ is taken to be the number density of
‘super-electrons’, i.e., those electrons that are responsible for the
dissipationless current density $\mathbf{j}_s$ in the superconductor (recall
the two-fluid model briefly introduced in sec. 7.6.2.1 where the electrons
in a superconductor are notionally classified into two groups: those
responsible for maintaining the zero field condition in the bulk of the
superconductor are said to constitute the ‘super-current’, which is
dissipation-less and flows in a surface layer of the material; the rest are
responsible for normal conduction involving dissipation).


The London equations, supplemented by the Maxwell equation

$$\mathrm{curl}\, \mathbf{B} = \mu_0 \mathbf{j}_s \qquad \text{(7-114)}$$

(the displacement current can be ignored unless one looks at high
frequency AC variations) describe the electrodynamics of a superconductor.
The Maxwell equation $\mathrm{curl}\, \mathbf{E} = -\dot{\mathbf{B}}$ is
implied in the London equations (7-113a). Formula (7-114), along with the
second London equation leads to

$$\nabla^2 \mathbf{B} = \frac{\mu_0}{\Lambda_L}\mathbf{B}, \qquad \nabla^2 \mathbf{j}_s = \frac{\mu_0}{\Lambda_L}\mathbf{j}_s. \qquad \text{(7-115)}$$

These formulae describe the penetration of surface currents and magnetic
fields within the bulk of a superconductor (refer back to sec. 7.6.2.2).
For instance, if a magnetic field of strength $H_0 (= \frac{B_0}{\mu_0})$ is
applied parallel to the surface of a superconductor (say, along the x-axis
of a co-ordinate system), then one can use the one-dimensional form of the
first equality in (7-115) to conclude that the field penetrates to a small
distance within the bulk of the material, its variation with the distance
$z$ measured in a direction normal to the surface (the z-axis of the
co-ordinate system) being

$$H = H_0 e^{-z/\lambda_L}, \qquad \text{(7-116a)}$$

while the screening current flowing parallel to the surface (in the x-y
plane) is similarly given by

$$j = j_0 e^{-z/\lambda_L}, \qquad \text{(7-116b)}$$

where $j_0$ is the screening current at the surface, and

$$\lambda_L = \sqrt{\frac{\Lambda_L}{\mu_0}} = \sqrt{\frac{m}{\mu_0 n_s e^2}} \qquad \text{(7-116c)}$$

is referred to as the London penetration depth. Typical values of the
penetration depth are of the order of 100 Å. In other words, the
super-current and the magnetic field penetrate only to a very small depth
within a superconductor, decaying rapidly with the distance away from the
surface. This can be looked upon, in essence, as an explanation (though,
one at a phenomenological level) of the Meissner effect (see sec. 7.6.5.6
for further elaboration).
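The order of magnitude of the London penetration depth can be checked with a short calculation. The sketch below evaluates $\lambda_L = \sqrt{m/(\mu_0 n_s e^2)}$ and the exponential field profile; the density of super-electrons `N_S_EXAMPLE` is an assumed illustrative value, the bare electron mass is used in place of any effective mass, and the function names are hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi     # vacuum permeability, H/m (classical value)
M_E = 9.1093837e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C


def london_depth(n_s: float) -> float:
    """London penetration depth in metres for super-electron density n_s (m^-3)."""
    return math.sqrt(M_E / (MU_0 * n_s * E_CHARGE ** 2))


def field_profile(z: float, H0: float, lam: float) -> float:
    """Field strength at depth z below the surface: H0 * exp(-z / lam)."""
    return H0 * math.exp(-z / lam)


N_S_EXAMPLE = 1.0e28  # m^-3, illustrative assumption

lam = london_depth(N_S_EXAMPLE)
# Fraction of the surface field surviving at a depth of five penetration depths.
fraction_at_5_lam = field_profile(5.0 * lam, 1.0, lam)
```

With the assumed density the depth comes out at a few hundred ångströms, and the field falls to less than one percent of its surface value within five penetration depths.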


Phenomena associated with superconductivity can be explained at vari- 
ous levels of generality, the London equations providing one such level. 
The BCS theory to be outlined below (sec. 7.6.5) is, to date, the most gen- 
eral microscopic explanation, with which the London theory is consistent. 
At an intermediate level between the two is the Landau-Ginzburg theory 
(sec. 7.6.6), which provides a remarkably successful explanation of su- 
perconductivity, based on an incisively intuitive understanding of phase 


transition phenomena in general. 


The parameter $n_s$ in the London theory is in the nature of a
phenomenological one, since the idea of super-electrons is a qualitative
and notional concept, which becomes more concrete in the light of the BCS
theory. The latter is based on the concept of Cooper pairs, which means
that $n_s$ should be equated to $\frac{n_e}{2}$ where $n_e$ is the number
density of electrons involved in the formation of Cooper pairs. This, in
turn, is a temperature
dependent parameter, in consequence of which the penetration depth
$\lambda_L$ is temperature dependent as well, described empirically by the
formula

$$\lambda_L(T) \approx \lambda_L(0)\left[1 - \left(\frac{T}{T_c}\right)^4\right]^{-1/2}. \qquad \text{(7-117)}$$
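The empirical temperature dependence of the penetration depth is easily tabulated. The sketch below assumes the widely used form $\lambda_L(T) \approx \lambda_L(0)[1 - (T/T_c)^4]^{-1/2}$; the function name and the sample values of $T_c$ and $\lambda_L(0)$ are illustrative assumptions.

```python
import math


def penetration_depth(T: float, Tc: float, lam0: float) -> float:
    """Empirical penetration depth lam0 / sqrt(1 - (T/Tc)^4), for 0 <= T < Tc."""
    if not 0.0 <= T < Tc:
        raise ValueError("the formula applies only for 0 <= T < Tc")
    return lam0 / math.sqrt(1.0 - (T / Tc) ** 4)


TC_EXAMPLE = 9.2      # K, illustrative value
LAM0_EXAMPLE = 50e-9  # m, illustrative value

ratio_half = penetration_depth(0.5 * TC_EXAMPLE, TC_EXAMPLE, LAM0_EXAMPLE) / LAM0_EXAMPLE
ratio_near_tc = penetration_depth(0.999 * TC_EXAMPLE, TC_EXAMPLE, LAM0_EXAMPLE) / LAM0_EXAMPLE
```

The depth changes by only a few percent up to half the critical temperature, but grows without bound as $T$ approaches $T_c$, where the formula behaves like $(T_c - T)^{-1/2}$.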
7.6.2.4 Type II superconductivity


As already mentioned, type II superconductivity is characterized by a
mixed phase (also referred to as the Shubnikov phase) in addition to the
Meissner phase and the normal phase, which is a thermodynamically
stable phase for applied fields of intermediate strength
($H_{c1} < H < H_{c2}$), characterized by the penetration of magnetic field
within the bulk of a superconductor.


The distinction between the type I and type II superconductors is
explained with reference to two distinct length scales characterizing a
superconductor, namely, the London penetration depth $\lambda_L$ introduced
in sec. 7.6.2.3, and the coherence length $\xi$ that makes its appearance
within the framework of the Landau-Ginzburg theory (see sec. 7.6.6 below).
In physical terms, $\xi$ is a characteristic length over which the number
density ($n_s$) of the super-electrons changes from a low to a high value at
the boundary of a normal and a superconducting phase. The ratio of the two
characteristic lengths defines the so-called Landau-Ginzburg (LG) parameter

$$\kappa = \frac{\lambda_L}{\xi}. \qquad \text{(7-118)}$$


One has to distinguish between the Pippard coherence length and the
Landau-Ginzburg coherence length ($\xi$, see sec. 7.6.2.4), of which the
former is a temperature-independent parameter and the latter is one that
diverges at $T \to T_c$ like $(T_c - T)^{-1/2}$. The two are related in the
sense that they converge for $T \ll T_c$. It is the LG coherence length that
features in formula (7-118).


Since the penetration depth also diverges at $T \to T_c$ like
$(T_c - T)^{-1/2}$ (refer to (7-117)), the ratio $\kappa$ is nearly
independent of temperature.


The Pippard coherence length measures the way the electrons forming the
Cooper pairs are correlated so that it is more probable to find two pairs
at relatively small distance than at larger distances from one another
([120], chapter 9). On the other hand, the coherence length $\xi$ mentioned
above (see also sec. 7.6.5.6) measures the distance over which the density
of Cooper pairs rises from zero value at the surface of a superconductor to
the value in its interior corresponding to the superconducting phase in
which the material becomes free of magnetic flux.


The parameter $\kappa$ controls the surface energy between a normal and a
superconducting region in a material. In the case of a type I
superconductor, the surface energy is positive, corresponding to a
relatively small value of $\kappa$. A type II superconductor, on the other
hand, is characterized by a relatively large value of $\kappa$, and the
surface energy is negative, implying that the system is unstable against
the formation of domains of magnetization separated by demagnetized
regions. The normal (magnetized) and the superconducting (diamagnetic)
regions form a geometric pattern corresponding to a minimum free energy of
the material as a whole. The


length scale characterizing the pattern is ~ 10~°cm. 


Type II superconductivity was predicted by Abrikosov who deduced that
the transition value of $\kappa$ demarcating the two types of
superconductivity was $\kappa = \frac{1}{\sqrt{2}}$. For a material with
$\kappa > \frac{1}{\sqrt{2}}$ the discontinuous (first order) phase
transition across $H_c$ is replaced with a continuous one when the applied
field is made to cross the lower critical value $H_{c1}$.
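The criterion separating the two types of superconductor can be put in the form of a small classifier. The sketch below takes the transition value of the Landau-Ginzburg parameter to be $1/\sqrt{2}$, the value attributed to Abrikosov; the function names and the sample lengths are hypothetical and do not refer to any particular material.

```python
import math

KAPPA_TRANSITION = 1.0 / math.sqrt(2.0)  # value demarcating type I from type II


def lg_parameter(lam: float, xi: float) -> float:
    """Landau-Ginzburg parameter kappa = lambda_L / xi (both in the same unit)."""
    if lam <= 0.0 or xi <= 0.0:
        raise ValueError("both lengths must be positive")
    return lam / xi


def classify(lam: float, xi: float) -> str:
    """Return 'type I' for kappa below the transition value, else 'type II'."""
    return "type I" if lg_parameter(lam, xi) < KAPPA_TRANSITION else "type II"


# Hypothetical samples: a long coherence length versus a short one.
sample_a = classify(lam=50e-9, xi=500e-9)  # kappa = 0.1
sample_b = classify(lam=200e-9, xi=5e-9)   # kappa = 40
```

A penetration depth much shorter than the coherence length gives a small $\kappa$ and positive surface energy (type I), while the opposite ordering of the two lengths gives type II behavior.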
Magnetized regions can also be formed within a superconductor of type I,
depending on the geometry of the sample. However, these regions are of 
macroscopic dimension, distinct from the magnetized flux tubes in a type II 
superconductor, the formation of the latter being determined by factors of 
microscopic origin. The existence of magnetized and demagnetized regions 
of macroscopic dimensions in a type I superconductor is said to constitute 


an intermediate state, as distinct from the mixed phase of a type II material. 


Another of Abrikosov’s major contributions was the elucidation of the 
nature of the magnetized regions in a type II superconductor where the 
magnetization appears as a regular array of quantized flux tubes (also 
referred to as vortex tubes), analogous to the quantized vortices in a
superfluid, introduced by Onsager (refer back to sec. 7.4.5). Each vortex
carries a quantum of magnetic flux $\Phi_0 = \frac{h}{2e}$, and is supported
within a superconducting (demagnetized) region by circulating currents. In
other
words, a vortex tube is made up of a core with its associated vortex cur- 
rent. Adjacent cores experience a repulsive magnetic force which plays 
a role in determining the lattice structure of the vortex tubes. Fig. 7-10 
depicts schematically a number of such vortex tubes in a type II super- 


conductor. 
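Since each vortex tube carries one flux quantum $\Phi_0 = h/2e$, the number of vortices threading a unit area follows from the mean flux density in the sample. The sketch below is a rough estimate under the assumption that the mean flux density $B$ is carried entirely by singly quantized vortices; the spacing estimate $\sqrt{\Phi_0/B}$ ignores the geometrical factor of the actual vortex lattice, and the function names and the sample field are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C

PHI_0 = H_PLANCK / (2.0 * E_CHARGE)  # flux quantum, Wb


def vortex_density(B: float) -> float:
    """Number of vortices per square metre for a mean flux density B (tesla)."""
    return B / PHI_0


def rough_vortex_spacing(B: float) -> float:
    """Order-of-magnitude distance between neighboring vortices, in metres."""
    return math.sqrt(PHI_0 / B)


B_EXAMPLE = 0.1  # T, illustrative value

n_vortex = vortex_density(B_EXAMPLE)
spacing = rough_vortex_spacing(B_EXAMPLE)
```

For the assumed field the estimate gives a spacing of the order of a tenth of a micrometre, which indicates the microscopic scale of the vortex pattern.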


In this introductory exposition, we will not enter into detailed theoretical
considerations relating to the vortex tubes and to superconductors of type
II, and will confine ourselves to stating only these few phenomenological
aspects. The Landau-Ginzburg theory, to be briefly outlined in sec. 7.6.6
below, constitutes a versatile theoretical framework for the description
and explanation of type I and type II superconductivity. As shown by
Gorkov, the LG theory, in turn, is related to the microscopic BCS theory
as a limiting form of the latter for $T \approx T_c$.


Figure 7-10: Vortex tubes (schematic) within a type II superconductor (dotted lines
demarcate part of a sample); circulating vortex currents are shown; the flux lines get
curved as they emerge from the sample into the ambient space.


7.6.2.5 High-$T_c$ superconductors


High-$T_c$ superconductors were discovered in certain groups of ceramics in
the nineteen eighties and opened up a vast area of theoretical and
experimental research engendering enormous possibilities. While the
‘classical’ superconductors are characterized by transition temperatures
$< 20$ K, the corresponding temperatures for the high-$T_c$ materials are
$\sim 100$ K.


Examples of high-$T_c$ superconducting ceramics are:
$\mathrm{YBa_2Cu_3O_7}$ (‘YBCO’, $T_c = 93$ K),
$\mathrm{Bi_2(Sr_2Ca)Cu_2O_8}$ (‘BSCCO’, $T_c = 110$ K), and
$\mathrm{Tl_2Ba_2Ca_2Cu_3O_{10}}$ (‘TBCCO’, $T_c = 125$ K). All these
materials have a relatively low oxygen content and have a layered crystal
structure, with several neighboring layers of copper oxide separated from
the next group of copper oxide layers by oxide layers of other metals. The
property of superconductivity is, in the main, associated with the copper
oxide planes since altering the other planes is found not to have much
effect.


Subsequent to the initial investigations on the ‘cuprates’, as these ceram- 
ics are referred to, a number of other classes of materials have been found 


to exhibit superconductivity with even higher transition temperatures. 


The mechanism underlying high-$T_c$ superconductivity is not well understood.
The BCS theory and the Landau-Ginzburg phenomenological approach
are believed to have some relevance, but a satisfactory understanding 
based on fundamental principles does not seem to be at hand. As far as 
the BCS theory is concerned, a number of major modifications (in com- 
parison to the standard version to be outlined in sec. 7.6.5 below) appear 
to be indicated in order that it may be of use in laying the foundations 
of a satisfactory theory. In the present introductory exposition, however, 
an account of the theoretical work in such directions will not be included 


(refer to [136] for an introduction). 


7.6.2.6 Josephson effect 


A Josephson junction is formed by two superconductors forming a con- 
tact, with a very thin insulating layer in between the two so that ‘normal’ 
current does not flow across the junction. However, there can occur a 
tunneling of Cooper pairs, due to which the coherent macroscopic wave 


functions in the two superconductors become coupled to each other.


The concept of the macroscopic wave function is one of great relevance in
understanding the behavior of superconductors. As we will see (sec. 7.6.5),
below the transition temperature ($T_c$) in a superconductor, there occurs
a Bose-Einstein-like condensation of a macroscopic number of electron
pairs into a single state, where all the pairs are correlated with one
another. At $T = 0$, all the electrons get bound into pairs, with the pairs
interlocked into one single coherent structure, described by the BCS ground
state. Depending on whether or not the Cooper pairs have a common
non-zero center-of-mass momentum (where the latter causes a supercurrent
to flow, see sec. 7.6.5.4), the BCS ground state acquires a phase $\phi$
that determines a number of macroscopic features characterizing a
superconductor (refer to sections 7.6.5.6, 7.6.5.7).


In particular, it determines the i-V characteristic of a Josephson
junction. If a supercurrent $i_s$ flows across a Josephson junction from,
say, superconductor ‘A’ to superconductor ‘B’, then

$$i_s = i_c \sin\gamma, \qquad \text{(7-119a)}$$

where $i_c$ is the critical value of the Josephson current and

$$\gamma = \phi_B - \phi_A - \frac{2e}{\hbar}\int_A^B \mathbf{A}\cdot d\mathbf{l}. \qquad \text{(7-119b)}$$

Here $\phi_A$, $\phi_B$ are the phases of the macroscopic wave functions of
the superconductors ‘A’ and ‘B’ respectively, while the third term on the
right hand side involves a line integral of the magnetic vector potential
from ‘A’ to ‘B’. The expression $\gamma$ is referred to as the
gauge-invariant phase difference since it is invariant under a gauge
transformation of the vector potential as
$\mathbf{A} \to \mathbf{A} + \nabla\psi$ and a simultaneous transformation of
the phase of the macroscopic wave function by an additive term as
$\phi \to \phi + \frac{2e}{\hbar}\psi$. Such a transformation does not have
discernible physical consequences.


The current equation (7-119a), referred to as the first Josephson
equation, is to be complemented with one relating the rate of change of the
gauge-invariant phase difference to the DC voltage ($V$) that may be
applied to the two sides of the Josephson junction, which reads

$$\dot{\gamma} = \frac{2e}{\hbar}V, \qquad \text{(7-119c)}$$

and is referred to as the second Josephson equation. When no voltage
is applied ($V = 0$), the Josephson current $i_s$ is constant, its maximum
possible value being $i_c$. On the other hand, for a non-zero DC voltage
$V$, $\gamma$ increases linearly with time, and an alternating supercurrent
flows across the Josephson junction, with a frequency

$$\nu = \frac{V}{\Phi_0}, \qquad \text{(7-119d)}$$

where $\Phi_0 = \frac{h}{2e}$ represents the elementary flux quantum, as
explained in sec. 7.6.5.7.


As for the derivation of the two Josephson equations, refer to [79]. 
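The two Josephson relations can be illustrated with a few lines of code. The sketch below assumes the standard forms $i_s = i_c \sin\gamma$ and $\dot{\gamma} = 2eV/\hbar$, with the vector potential term set to zero; the function names and the sample voltage are illustrative assumptions. It evaluates the frequency of the alternating supercurrent for a given DC voltage and checks that the phase advances by $2\pi$ in one period.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = H_PLANCK / (2.0 * math.pi)
PHI_0 = H_PLANCK / (2.0 * E_CHARGE)  # flux quantum, Wb


def josephson_current(gamma: float, i_c: float) -> float:
    """First Josephson relation: supercurrent for phase difference gamma."""
    return i_c * math.sin(gamma)


def phase_at(t: float, V: float, gamma0: float = 0.0) -> float:
    """Second Josephson relation integrated for a constant voltage V."""
    return gamma0 + 2.0 * E_CHARGE * V * t / HBAR


def josephson_frequency(V: float) -> float:
    """Frequency (Hz) of the alternating supercurrent for a DC voltage V."""
    return V / PHI_0


V_EXAMPLE = 1.0e-6  # V, illustrative value

nu = josephson_frequency(V_EXAMPLE)
phase_after_one_period = phase_at(1.0 / nu, V_EXAMPLE)
```

A voltage of one microvolt corresponds to a frequency of roughly 484 MHz, which indicates why the junction acts as a sensitive voltage-to-frequency converter.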


The Josephson effect, summarized in equations (7-119a)-(7-119d), is one
of great theoretical and practical relevance. However, it will not be
pursued further in this introductory exposition on the basic principles of
statistical mechanics.


7.6.3 The origin of the attractive electron-electron interaction


The Coulomb interaction between two electrons in vacuum, given by

$$V(r) = \frac{e^2}{4\pi\epsilon_0 r} \tag{7-120a}$$

(in usual notations) is instantaneous, with a Fourier transform

$$V(k) = \frac{e^2}{\epsilon_0 k^2}, \tag{7-120b}$$

corresponding to a dielectric function

$$\epsilon = \epsilon_0, \tag{7-120c}$$

which is the permittivity of free space.


More generally, the dielectric function for a material medium is a function $\epsilon(k,\omega)$ of variables $k, \omega$, where $k$ stands for the wave vector of an electric field, induced in the material by an external field of frequency $\omega$. The induced field, which can be looked upon as the response of the medium to the external field, differs from the latter, due to the screening effect of the electrons and the positive ions in it where, generally speaking, the response involves a delay. In the case of electrons in a metal, the dielectric function can be worked out, first by considering a stationary periodic array of positive ions around which the assembly of electrons arranges itself in a stable configuration, and then modifying the resulting function by including the effect of collective vibrations of the periodic array. Such a conceptual separation is meaningful because of two sharply distinct time scales characterizing the ionic and electronic motions, essentially due to the order-of-magnitude difference between the ionic and electronic masses.


The electronic charge configuration around a fixed distribution of positive charges can be worked out in the Thomas-Fermi theory ([91]), a semiclassical theory whose condition of validity is that the length scale characterizing the variation of the charge density (and the potential resulting from the charge distribution) is to be large compared to the relevant Fermi length. The spatial configuration of the electrons in the system produces a dressing of the ionic charge distribution, and the resulting charge distribution in the system partially screens the interior of the metal from external fields. This, in turn, implies a modified dielectric function that now depends on $k$ (but is still independent of $\omega$), and a modified effective interaction between a pair of electrons in the system as compared to (7-120b), where one now has

$$\epsilon(k) = \epsilon_0\left(1 + \frac{k_0^2}{k^2}\right), \qquad V_{\rm eff}(k) = \frac{e^2}{\epsilon_0}\,\frac{1}{k^2 + k_0^2}. \tag{7-121}$$

Because of the assumption of a fixed distribution of the ionic charges, the dielectric function is independent of $\omega$ since the dressing of the positive charges by the electrons, and hence the response to an external field, occurs instantaneously. In the above expression, the constant $k_0$ stands for a characteristic inverse length describing the electronic screening of the positive charges (in turn, the electronic charge distribution may be looked upon as being ‘dressed’ by the positive charges), as a result of which the interaction between the electrons is effectively described by a Yukawa-like screened Coulomb potential ($\sim \frac{e^{-k_0 r}}{r}$).


If one now considers the motion of the positive charges which, in the case of a periodic array, takes the form of collective oscillations, the electronic dressing of the ions referred to above becomes time-dependent, and the response of the entire system of ions and electrons to an external field becomes a delayed one. The screening of the interior of the material to the external field is now described by a dielectric function that depends on the frequency $\omega$ in addition to the wave vector $k$, and is found to be of the form (see [1], chapter 26)

$$\epsilon(k,\omega) = \epsilon_0\left(1 + \frac{k_0^2}{k^2}\right)\left(1 - \frac{\omega(k)^2}{\omega^2}\right), \tag{7-122}$$

where $\omega(k)$ stands for the frequency of collective oscillations of the periodic array of positive ions. Incidentally, the theory underlying the formula (7-122) looks at the positive charge distribution as a continuous one and the collective oscillations of the ions as those of a continuous medium. Thus, our allusion to the periodic array of ions is not to be taken in the sense of a discrete distribution of charges, and the collective oscillations mentioned above refer to long wavelength elastic waves.

The effective interaction between electrons in the metallic medium now depends on $k$ and $\omega$, and appears as

$$V_{\rm eff}(k,\omega) = \frac{e^2}{\epsilon_0}\,\frac{\omega^2}{(k^2 + k_0^2)(\omega^2 - \omega(k)^2)}. \tag{7-123}$$


One can see that the effective interaction changes sign as $\omega$ crosses $\omega(k)$ from above, implying that, owing to the delaying effect of the ionic motions, the effective interaction between screened (or ‘dressed’) electrons in a metal may change over from a repulsive to an attractive one. This phenomenon of cross-over from a repulsive to an effectively attractive interaction between electrons in a metal is referred to as ‘overscreening’. From the physical point of view, a moving electron, in following an oscillating electric field, may leave a wake of (an oscillating) positive charge density behind it which, in turn, may attract a second electron, given the correct ordering of the relevant frequencies.
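The sign change of the effective interaction can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and units are chosen so that $e^2/\epsilon_0 = 1$ by default. It evaluates the frequency-dependent interaction as the static screened interaction multiplied by the factor $\omega^2/(\omega^2 - \omega(k)^2)$.

```python
def static_screened_interaction(k, k0, e2_over_eps0=1.0):
    """Thomas-Fermi screened interaction e^2 / (eps_0 (k^2 + k0^2))."""
    return e2_over_eps0 / (k ** 2 + k0 ** 2)


def effective_interaction(k, omega, k0, omega_k, e2_over_eps0=1.0):
    """Frequency-dependent effective interaction: the static screened
    interaction multiplied by omega^2 / (omega^2 - omega_k^2)."""
    if omega == omega_k:
        raise ValueError("the interaction is singular at omega = omega_k")
    static = static_screened_interaction(k, k0, e2_over_eps0)
    return static * omega ** 2 / (omega ** 2 - omega_k ** 2)


if __name__ == "__main__":
    k, k0, omega_k = 1.0, 1.0, 1.0
    for omega in (0.5, 0.9, 1.1, 2.0, 100.0):
        v = effective_interaction(k, omega, k0, omega_k)
        kind = "repulsive" if v > 0 else "attractive"
        print(f"omega = {omega:6.1f}  V_eff = {v:+.4f}  ({kind})")
```

For frequencies well above $\omega(k)$ the result approaches the static screened value, since the ions are then too slow to respond; below $\omega(k)$ the interaction is negative, i.e., attractive.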


The possibility that an effectively attractive interaction may appear between electrons in a polarizable material due to the screening effect of opposite charges was pointed out by Fröhlich, whose idea was made use of in the development of the BCS theory of superconductivity.


One can arrive at an effectively attractive interaction between electrons by the use of perturbation theory ([44], chapter 7) where the influence of the interaction between electrons and phonons is taken into account in successive orders of perturbation in determining the matrix element of the total interaction Hamiltonian (including the interaction among the electrons) between plane wave states of the electron. The terms appearing in the perturbation expansion are classified conveniently by the use of Feynman diagrams. One again arrives at the result that the phonon-mediated electron-electron interaction can, under appropriate circumstances, be an attractive one.


However, the phonon-mediated interaction is to be looked upon as one possible mechanism of an effectively attractive force between electrons in a metallic environment. Other mechanisms, where additional factors are involved in the origin of the attractive interaction, cannot be ruled out in superconductors of diverse descriptions. The phonon mechanism appears to be responsible in the case of most known superconductors, as can be inferred from the isotope effect mentioned in sec. 7.6.5.3.


7.6.4 Bound state: the Cooper pair 


Starting from the observation that the effective electron-electron interaction in a metal can be an attractive one, Cooper established the important result that a pair of electrons close to the Fermi surface can form a bound state, even in the case of an arbitrarily weak attractive interaction. The result depends in an essential way on the fermionic nature of the electrons and on the fact that the interaction between the two electrons takes place in the background of the Fermi sea (comprised of filled up single-particle levels with momenta up to $p_F$, the Fermi momentum).


The effective Hamiltonian for the system of electrons (making up the Fermi sea and the pair under consideration) is written in the form

$$H = \sum_{k\sigma}\epsilon(k)\,a^\dagger_{k\sigma}a_{k\sigma} + \frac{1}{2}\sum_{kk'q\sigma\sigma'} v(q)\,a^\dagger_{k+q,\sigma}a^\dagger_{k'-q,\sigma'}a_{k'\sigma'}a_{k\sigma}, \tag{7-124a}$$

where we have made use of the formalism of second quantization, outlined in sec. 7.2, in the momentum representation (or, more precisely, in the representation in terms of Bloch states characterized by Bloch momenta $k$). In this expression, $a_{k\sigma}, a^\dagger_{k\sigma}$ stand for single-particle annihilation and creation operators in appropriately defined Bloch states (with energy $\epsilon(k)$ and spin $\sigma$) for non-interacting electrons in the metal, and the second term on the right hand side represents the contribution to the Hamiltonian by the effective electron-electron interaction (refer back to (7-21) in sec. 7.2 and the background explained therein).


A detailed analysis of the phonon-mediated electron-electron interaction in a metal shows that an attractive interaction between a pair of electrons in the states $(k, \sigma)$ and $(k', \sigma')$ arises when the two momenta ($|\hbar k|, |\hbar k'|$) differ from the Fermi momentum $p_F$ by only a small margin, so that $\epsilon(k) - \epsilon_F$ is at most of the order of $\hbar\omega_D$, where $\omega_D$ is the Debye frequency characterizing the acoustic phonons in the crystalline lattice of the metal under consideration.


Cooper made the simplifying assumption that only electrons with Bloch energies $\epsilon(k), \epsilon(k')$ satisfying $|\epsilon(k) - \epsilon(k')| < \hbar\omega_D$ participate in the interaction and that, moreover, $v(q)$ is constant ($v(q) = -g$, $g > 0$) in respect of matrix elements of the interaction Hamiltonian between states with energies satisfying the above constraint. In other words, (7-124a) is further simplified to

$$H = \sum_{k\sigma}\epsilon(k)\,a^\dagger_{k\sigma}a_{k\sigma} - \frac{g}{2}\sum_{kk'q\sigma\sigma'} a^\dagger_{k+q,\sigma}a^\dagger_{k'-q,\sigma'}a_{k'\sigma'}a_{k\sigma} \quad (g > 0), \tag{7-124b}$$

where the summations over $k$, $k'$, and $q$ are assumed to conform to the constraint mentioned above (namely, $\epsilon(k), \epsilon(k'), \epsilon(k+q), \epsilon(k'-q)$ have to be all within a range $\hbar\omega_D$ above $\epsilon_F$). With this Hamiltonian in place, Cooper looked for a two-electron bound state in the form of an antisymmetrized superposition of product states of the form $|k\uparrow\rangle \otimes |-k\downarrow\rangle$ alongside the filled up Fermi ‘sea’ represented by the vector, say, $|F\rangle$, i.e.,

$$|\psi\rangle = \sum_k \alpha_k\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}|F\rangle, \tag{7-125}$$


where, once again, the summation over $k$ has to conform to the energy constraint mentioned above. In this assumed form for $|\psi\rangle$, $\alpha_k$ are a set of coefficients to be determined by requiring that $|\psi\rangle$ should be an eigenstate of the Hamiltonian (7-124b). The fact that only those product states contribute to the sum in (7-125) that are made up of single-particle states with opposite momenta and spin follows from the fermionic nature of the electrons. The appearance of the antisymmetric spin state (referred to as the spin singlet) is analogous to the occurrence of the singlet state in bound configurations arising in various contexts in nuclear, atomic and molecular physics.


It is of importance to note with reference to (7-124a), (7-124b) that the effective coupling strength $g$ describing the electron-electron interaction scales as $\frac{1}{V}$ in the thermodynamic limit. Defining a normalized strength $g_0$ as $g_0 = gV$, the results stated below can all be expressed in terms of $g_0$ rather than $g$. Indeed $g$ is a dimensional constant having the dimension of energy divided by volume. As we will see, the principal results of physical relevance involve the product $g\nu_F$, where $\nu_F$, the density of states at the Fermi level (refer to (7-128b) below), scales like $V$, and has the dimension of volume divided by energy. All the results stated below can equally well be represented in terms of $g_0 n_F$, where $n_F$ stands for the density of states per unit volume at the Fermi level.


1. The situation being considered here is one involving two electrons 
added to the filled up Fermi sea at T = 0, with opposite spins and 
with momenta just above the Fermi momentum interacting with each 
other in a manner as outlined above. It is assumed that these interact 


with the filled Fermi sea only in virtue of the Pauli principle. 


2. If, instead of the spin singlet state one considers a spin triplet then the 
space part of the total wave function has to be antisymmetric in the 
spatial co-ordinates. In the case of an attractive interaction the overlap 
integral (featuring in the interaction energy) for such a spatially anti- 
symmetric combination has a smaller value when compared to the 
overlap integral for a spatially symmetric combination that arises in 
the case of a spin singlet. This implies that the spin singlet is favored 


in the formation of a bound state. 


3. The magnitudes of the two momenta in the product states featuring in the superposed state considered above have been assumed to be the same in order to arrive at a simple approximation to the bound state we are looking for, since we know that the two momenta can differ only to a very small extent owing to the energy constraint mentioned above ($|\epsilon(k) - \epsilon(k')| < \hbar\omega_D$), the Debye frequency being typically orders of magnitude smaller than the frequencies characterizing the single-electron Bloch states. Indeed, considering a pair of single-particle states with distinct magnitudes of the momenta satisfying the above constraint, and then working out the scattering amplitude of these two to final states compatible with the conservation of the total momentum ($P = \hbar k + \hbar k'$), one finds that the available phase space for the final states has a sharp maximum when $P = 0$, as compared with non-zero values of $P$.


Since, as mentioned above, the sought-for bound state has to satisfy the Schrödinger equation for the Hamiltonian (7-124b) corresponding to the eigenvalue, say, $E$, we have to have

$$H|\psi\rangle = E|\psi\rangle, \tag{7-126a}$$

which translates to

$$\alpha_k = \frac{g\sum_{k'}\alpha_{k'}}{2\epsilon(k) - E} \tag{7-126b}$$


(check this out; make use of the anti-commutation relations satisfied by the fermionic annihilation and creation operators, as in (7-9b); also make use of the fact that states of the form $a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}|F\rangle$ for distinct values of $k$ are linearly independent). In the above equation, $g\,(> 0)$ stands for the strength of the effective electron-electron attractive interaction, whose magnitude has been left unspecified. It is now easy to solve for the energy eigenvalue $E$ for any given value of $g$ since, on summing both sides over $k$ and canceling the common factor $\sum_k \alpha_k$, one obtains

$$\frac{1}{g} = \sum_k \left(2\epsilon(k) - E\right)^{-1}. \tag{7-127}$$


We now go over to the thermodynamic limit and replace the summation over $k$ by an integral over the energy $\epsilon$ over the range $\epsilon_F$ to $\epsilon_F + \hbar\omega_D$, with a factor $\nu(\epsilon)$ coming in, which stands for the density of states for any given spin orientation (i.e., the density of the wave vectors $k$ along the energy scale) at energy $\epsilon$. Noting that the energy integration extends over only a small range above $\epsilon_F$, one can replace $\nu(\epsilon)$ with $\nu_F$, the density of states for any given spin orientation at the Fermi surface. One thereby obtains

$$\frac{1}{g} = \frac{\nu_F}{2}\ln\frac{2\epsilon_F + 2\hbar\omega_D - E}{2\epsilon_F - E}, \tag{7-128a}$$
where (compare with (7-92))

$$\nu_F = \frac{V(2m)^{3/2}\epsilon_F^{1/2}}{4\pi^2\hbar^3}. \tag{7-128b}$$


Formula (7-128a) gives the required solution for the eigenvalue $E$, which we write in the form

$$E = 2\epsilon_F - \frac{2\hbar\omega_D}{e^{2/(g\nu_F)} - 1}. \tag{7-128c}$$


This tells us that $E < 2\epsilon_F$ regardless of the strength of the attractive electron-electron interaction. In other words, two electrons imagined to have energies above the Fermi level with momenta and spins as described above combine to form a bound state, i.e., one with energy less than $2\epsilon_F$, if there exists an attractive interaction between them. The formation of such a bound state is understandable in the case of a strong interaction ($g\nu_F \gg 1$), in which case the binding energy ($2\epsilon_F - E$) is seen to be large in accordance with (7-128c). The interest in Cooper’s result stems from the observation that a bound state is formed even when $g\nu_F$ is arbitrarily small, in which case the binding energy is given by

$$2\epsilon_F - E \approx 2\hbar\omega_D\,e^{-2/(g\nu_F)}. \tag{7-128d}$$
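The dependence of the Cooper pair binding energy on the dimensionless coupling can be made concrete with a few lines of Python. This is a sketch only: the function names are hypothetical, the argument `g_nu` denotes the product of the coupling strength and the density of states at the Fermi level, and energies are measured in units of $\hbar\omega_D$ by default.

```python
import math


def cooper_binding_energy(g_nu, hbar_omega_d=1.0):
    """Binding energy 2*eps_F - E = 2*hbar*omega_D / (exp(2/g_nu) - 1)."""
    return 2.0 * hbar_omega_d / math.expm1(2.0 / g_nu)


def cooper_binding_energy_weak(g_nu, hbar_omega_d=1.0):
    """Weak-coupling form 2*hbar*omega_D * exp(-2/g_nu)."""
    return 2.0 * hbar_omega_d * math.exp(-2.0 / g_nu)


def eigenvalue_condition(binding, g_nu, hbar_omega_d=1.0):
    """(g*nu_F/2) * ln((2*hbar*omega_D + B) / B), with B the binding energy;
    the eigenvalue equation is satisfied when this equals 1."""
    return 0.5 * g_nu * math.log((2.0 * hbar_omega_d + binding) / binding)


if __name__ == "__main__":
    for g_nu in (0.1, 0.2, 0.3, 1.0, 5.0):
        exact = cooper_binding_energy(g_nu)
        weak = cooper_binding_energy_weak(g_nu)
        print(f"g_nu = {g_nu:4.1f}  exact = {exact:.6e}  weak = {weak:.6e}")
```

The binding energy stays positive however small `g_nu` is made, but it is not analytic at zero coupling, which is why the bound state cannot be obtained from a perturbation expansion in powers of $g$. (For extremely small `g_nu` the exponential overflows in floating point, so the sketch is meant for moderate values only.)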


Based on this conclusion, Bardeen, Cooper and Schrieffer laid down the 
basic theory (the BCS theory) of superconductivity by showing that, at 
T = 0, the Fermi sea itself is unstable against the formation of a macro- 
scopically large number of Cooper pairs, all of which condense into the 
same state of the form (7-125), thereby conferring novel properties on 
the metallic medium under consideration. Indeed, knowing that a pair of 
electrons added to the filled up Fermi sea forms a bound state with bind- 
ing energy greater than the total excess of their initial energies over the 
Fermi level, one can check that the same situation applies to pairs of elec- 
trons initially just below the Fermi level at T = 0. The formation of bound 
Cooper pairs out of the Fermi sea is expected to continue till it becomes 


energetically unfavorable for the Fermi sea to be depleted further. 


For an explanation of the formation of the Cooper pair in simple physical 


terms, see [120]. 
7.6.5 Elements of the BCS theory


In this section we follow [136] to outline the basic ideas of the BCS theory, 


starting from the construction of the BCS ground state. 


As mentioned above, a macroscopically large number of Cooper pairs can be formed as bound configurations of two electrons each. Thus, instead of the single-particle states labeled $(k\sigma)$, we focus on pair states labeled $(k\uparrow, -k\downarrow)$. Considering pair states corresponding to various possible values of $k$, a good guess for the BCS ground state would be of the form

$$|\psi_G\rangle = \prod_k \left(u_k + v_k\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}\right)|\phi_0\rangle, \tag{7-129}$$

where the product is taken formally over all the Bloch vectors $k$ making up an energy band, and where $|\phi_0\rangle$ stands for the vacuum state, with no particles present. Here $u_k, v_k$ are undetermined coefficients for each $k$ appearing in the product, and have the physical significance of determining the probability of occurrence of a $(k\uparrow, -k\downarrow)$ pair in the ground state. Thus, $|v_k|^2$ is the probability that a Cooper pair labeled $(k\uparrow, -k\downarrow)$ exists in the state $|\psi_G\rangle$ and, correspondingly, $|u_k|^2$ is the probability that such a pair does not exist (where, necessarily, $|u_k|^2 + |v_k|^2 = 1$ for all $k$). In other words, the number of Cooper pairs is not a good quantum number for describing the BCS eigenstates of the effective Hamiltonian describing the formation of such pairs. The latter is chosen to be a reduced one obtained from (7-124a):


$$H = \sum_{k\sigma}\epsilon(k)\,a^\dagger_{k\sigma}a_{k\sigma} + \sum_{kl}V_{kl}\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}a_{-l\downarrow}a_{l\uparrow}. \tag{7-130a}$$

Though appropriate in the context of superconductivity, this Hamiltonian omits many terms of (7-124a) describing electron-electron scattering, since these are not relevant in respect of pairing. With this reduced Hamiltonian in place, one can now determine the sets of coefficients $\{u_k\}, \{v_k\}$ determining the distribution of Cooper pairs labeled $(k\uparrow, -k\downarrow)$ in the BCS ground state (7-129) among the various possible values of $k$, and then go on to determine other characteristic equilibrium and non-equilibrium features of the phenomenon of superconductivity. Since one anticipates that the states relevant in superconductivity do not correspond to well defined numbers of electrons, it is more appropriate to work with the grand Hamiltonian

$$K = H - \mu N = \sum_{k\sigma}\left(\epsilon(k) - \mu\right)a^\dagger_{k\sigma}a_{k\sigma} + \sum_{kl}V_{kl}\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}a_{-l\downarrow}a_{l\uparrow}, \tag{7-130b}$$

where $N = \sum_{k\sigma}a^\dagger_{k\sigma}a_{k\sigma}$ is the number operator for the electrons, and $\mu$ stands for the chemical potential, in terms of which the mean number of pairs in the ground state is determined. In the thermodynamic limit, the number distribution of the Cooper pairs in the ground state is expected to be sharply peaked about the mean value $\frac{1}{2}\bar{N}$ and, correspondingly, the non-occupancy and occupancy probabilities $|u_k|^2, |v_k|^2$ are expected to be distributed sharply in $k$.


Recalling that the attractive electron-electron interaction is effective for electron pairs with their momenta differing in magnitude only to a small extent from the Fermi momentum (i.e., with energies close to $\epsilon_F$, which is the value of the chemical potential at $T = 0$), $|v_k|^2$ is expected to fall sharply from unity to zero over a thin energy range around $\mu$ and, correspondingly, $|u_k|^2$ is expected to rise from zero to unity. This is borne out by the results stated below on the BCS ground state, based on the assumption that the interaction is effective only for electrons with energy $\epsilon$ satisfying $|\epsilon - \epsilon_F| < \hbar\omega_D$.


Indeed, on working out the mean number of electrons forming the pair states ($\bar{N} = \langle N\rangle = \langle\psi_G|\sum_{k\sigma}a^\dagger_{k\sigma}a_{k\sigma}|\psi_G\rangle$) and, similarly, the mean squared number ($\overline{N^2} = \langle N^2\rangle$), with $|\psi_G\rangle$ given by (7-129) (refer to [136]), one obtains

$$\bar{N} = 2\sum_k |v_k|^2, \qquad \overline{N^2} - \bar{N}^2 = 4\sum_k |u_k|^2|v_k|^2, \tag{7-131}$$

these results being compatible with the physical significance of $|u_k|^2, |v_k|^2$ as the non-occupancy and occupancy probabilities of pair states labeled $(k\uparrow, -k\downarrow)$ for the various possible values of $k$. Since, as mentioned above, $|u_k|^2|v_k|^2 \approx 0$ for most of the $k$ values excepting values corresponding to $\epsilon(k)$ within a narrow window around $\mu$, it follows that the relative spread $\frac{\Delta N}{\bar{N}}$ is indeed small.


Based on results stated below, the relative spread 5% works out to (7) 


which, in a typical situation, is ~ 10~'’. 


Having had a preliminary idea as to what the BCS ground state looks like, more concrete results can be derived by either a variational method or by a canonical transformation (the Bogoliubov transformation mentioned below) from the operators $a_{k\sigma}, a^\dagger_{k\sigma}$ to a new set of annihilation and creation operators in terms of which the Hamiltonian appears in a diagonal form. This leads one to the excitation spectrum in the BCS model.
7.6.5.1 BCS ground state: the variational method


Briefly stated, the variational method aims at making an appropriate guess (from among $|\psi_G(\theta)\rangle$) for the ground state $|\psi_G\rangle$ one is looking for, where $\theta$ stands for one or more parameters that determine a family of states from which the actual ground state $|\psi_G\rangle$ (more precisely, an approximation to it) is determined by minimizing the expectation value $\langle\psi_G(\theta)|K|\psi_G(\theta)\rangle$ with respect to the parameter(s) $\theta$. This is expressed as

$$\delta\langle\psi_G(\theta)|K|\psi_G(\theta)\rangle = 0, \tag{7-132}$$

where $\delta$ denotes the first variation of $\langle\psi_G(\theta)|K|\psi_G(\theta)\rangle$ with respect to $\theta$, while the positive definiteness of the second variation also needs to be ensured. The requirement (7-132), considered at $T = 0$ (i.e., $\mu = \epsilon_F$), gives the BCS ground state, where one needs to consider $K$ instead of $H$ because the trial state (7-129) does not have a well defined particle number.


In the present instance, one chooses $|\psi_G(\theta)\rangle$ in the form (7-129), in which the coefficients $u_k, v_k$ are taken to be of the form

$$u_k = \sin\theta_k, \qquad v_k = \cos\theta_k, \tag{7-133}$$

so that $\theta$ actually stands for the set of parameters $\{\theta_k\}$. This choice of the parameters automatically satisfies the constraints $|u_k|^2 + |v_k|^2 = 1$.

We have chosen $u_k, v_k$ to be real for all $k$, but a more general choice would give essentially the same results; however, there remains the choice of a constant phase factor $e^{i\phi}$ whereby $v_k$ can be replaced with $e^{i\phi}v_k$, for which, see below.


It turns out that the minimization is achieved with $\theta_k$ satisfying

$$\tan 2\theta_k = \frac{\sum_l V_{kl}\sin 2\theta_l}{2\xi_k}, \tag{7-134a}$$

where

$$\xi_k = \epsilon(k) - \mu. \tag{7-134b}$$

This determines the BCS ground state in terms of the set of parameters $\{\theta_k\}$ determined by (7-134a). As regards the interpretation of this result, we define

$$\Delta_k = -\sum_l V_{kl}u_lv_l = -\frac{1}{2}\sum_l V_{kl}\sin 2\theta_l, \qquad E_k = \sqrt{\Delta_k^2 + \xi_k^2}. \tag{7-135}$$


As will be seen from the diagonalization of the BCS Hamiltonian (sec. 7.6.5.2) by employing the Bogoliubov transformation, $E_k$ will represent the excitation energy of Cooper pairs characterized by Bloch vectors $k$. The minimum excitation energy, referred to as the BCS energy gap, corresponds to $\epsilon(k) = \epsilon_F$, and is given by $\Delta_{k_F} = \Delta$ (say) (indeed, $\Delta_k = \Delta$ for $k$ close to $k_F$, see below). The energy gap $\Delta$ is actually a function of temperature (our considerations so far have been restricted to $T = 0$) and can be interpreted as the order parameter for superconductivity: it increases from zero at $T_c$ (the transition temperature) and reaches a maximum at $T = 0$. More generally, the order parameter is a complex number of the form $e^{i\phi}\Delta$, where $\phi$ is a constant phase. The physical characteristics relating to superconductivity are independent of $\phi$, and this gauge symmetry is broken in any particular instance of the occurrence of superconductivity, where some particular value of $\phi$ is singled out. It turns out that the choice of $\phi$ modifies the expression (7-129) by way of replacing $v_k$ with $e^{i\phi}v_k$, i.e., the theory is invariant against the choice of a constant but arbitrary phase factor between $u_k$ and $v_k$ for all Bloch vectors $k$.


Making use of the definitions (7-135), one obtains from (7-133) and (7-134a),

$$\tan 2\theta_k = -\frac{\Delta_k}{\xi_k}, \qquad 2u_kv_k = \sin 2\theta_k = \frac{\Delta_k}{E_k}, \qquad v_k^2 - u_k^2 = \cos 2\theta_k = -\frac{\xi_k}{E_k}. \tag{7-136}$$

While the relative signs of $\sin 2\theta_k$ and $\cos 2\theta_k$ in the above expressions are fixed by the expression for $\tan 2\theta_k$, the signs of the individual expressions are chosen in conformity with the physical requirement that, as $\xi_k \to \infty$ (i.e., $E_k \to \xi_k$), the occupancy probability should go to zero (i.e., $u_k \to 1$, $v_k \to 0$).


One can now substitute the second expression of (7-136) into the first equation of (7-135) so as to arrive at the consistency condition

$$\Delta_k = -\frac{1}{2}\sum_l \frac{V_{kl}\Delta_l}{\sqrt{\Delta_l^2 + \xi_l^2}}. \tag{7-137}$$


These equations possess the trivial solution $\Delta_k = 0$ for all $k$, corresponding to the filled up Fermi sea in the normal state at $T = 0$ (where $v_k = 0$ for $\epsilon(k) > \mu$ and $v_k = 1$ for $\epsilon(k) < \mu$). However, there exists a non-trivial solution with a lower energy for negative $V_{kl}$, corresponding to an attractive electron-electron interaction. We assume, for the sake of simplicity, and in keeping with our earlier characterization of the phonon-mediated electron-electron interaction, that $V_{kl}$ is given by

$$V_{kl} = \begin{cases} -g & \text{if } |\xi_k|, |\xi_l| < \hbar\omega_D \quad (g > 0), \\ 0 & \text{otherwise.} \end{cases} \tag{7-138}$$

For such a simplified model for the electron-electron interaction, where $g$ stands for the strength of attraction, we obtain the following non-trivial solution for $\Delta_k$:

$$\Delta_k = \begin{cases} \Delta & \text{for } |\xi_k| < \hbar\omega_D, \\ 0 & \text{for } |\xi_k| > \hbar\omega_D, \end{cases} \tag{7-139}$$


where $\Delta$ has been introduced above (see paragraph following (7-135)) and has the significance of representing the BCS energy gap (refer to sec. 7.6.5.2 below). Indeed, as mentioned above, the normal state of the system at $T = 0$, corresponding to the filled Fermi sea, corresponds to $\Delta = 0$.

The self-consistency condition (7-137) now assumes the simple form

$$\frac{1}{g} = \frac{1}{2}\sum_k \frac{1}{E_k}, \tag{7-140}$$

from which one obtains, on replacing the summation by an integration in the thermodynamic limit and then evaluating the integral,

$$\frac{1}{\nu_F g} = \sinh^{-1}\frac{\hbar\omega_D}{\Delta}, \tag{7-141a}$$

or, in the weak coupling approximation ($\nu_F g \ll 1$),

$$\Delta \approx 2\hbar\omega_D\,e^{-1/(\nu_F g)}. \tag{7-141b}$$
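The step from the sum over $k$ to the closed-form gap can be checked numerically. In the sketch below (an illustration under stated assumptions, with hypothetical function names), `g_nu` stands for the product $\nu_F g$, energies are in units of $\hbar\omega_D$ by default, the density of states is taken to be constant over the window $|\xi_k| < \hbar\omega_D$, and the sum over $k$ is replaced by a midpoint-rule integral over $\xi$.

```python
import math


def gap_exact(g_nu, hbar_omega_d=1.0):
    """Gap from 1/(nu_F g) = asinh(hbar*omega_D / Delta)."""
    return hbar_omega_d / math.sinh(1.0 / g_nu)


def gap_weak_coupling(g_nu, hbar_omega_d=1.0):
    """Weak-coupling gap 2*hbar*omega_D * exp(-1/g_nu)."""
    return 2.0 * hbar_omega_d * math.exp(-1.0 / g_nu)


def gap_equation_ratio(delta, g_nu, hbar_omega_d=1.0, steps=200000):
    """g * (1/2) * sum_k 1/E_k, with the sum replaced by
    nu_F * integral over xi in (-hbar*omega_D, hbar*omega_D),
    evaluated by the midpoint rule. Equals 1 at the self-consistent gap."""
    h = 2.0 * hbar_omega_d / steps
    total = 0.0
    for i in range(steps):
        xi = -hbar_omega_d + (i + 0.5) * h
        total += h / math.hypot(xi, delta)   # 1/E_k with E_k = sqrt(xi^2 + delta^2)
    return 0.5 * g_nu * total


if __name__ == "__main__":
    g_nu = 0.3
    delta = gap_exact(g_nu)
    print("exact gap        :", delta)
    print("weak-coupling gap:", gap_weak_coupling(g_nu))
    print("self-consistency :", gap_equation_ratio(delta, g_nu))
```

Note the exponent $-1/(\nu_F g)$ here, as against $-2/(g\nu_F)$ in the binding energy of a single Cooper pair: for weak coupling the many-body gap is much larger than the single-pair binding energy.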


With this expression for the energy gap, $u_k$, $v_k$ are finally obtained as

$$u_k^2 = \frac{1}{2}\left(1 + \frac{\xi_k}{E_k}\right), \qquad v_k^2 = \frac{1}{2}\left(1 - \frac{\xi_k}{E_k}\right). \tag{7-142}$$
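A short Python sketch makes the behavior of these probabilities explicit. The function name is hypothetical, and the gap is assumed to be a constant `delta` over the range of $\xi_k$ considered.

```python
import math


def coherence_factors(xi, delta):
    """Return (u_k^2, v_k^2) for displaced energy xi and gap delta,
    using u^2 = (1 + xi/E)/2 and v^2 = (1 - xi/E)/2 with E = sqrt(xi^2 + delta^2)."""
    e_k = math.hypot(xi, delta)
    u2 = 0.5 * (1.0 + xi / e_k)
    v2 = 0.5 * (1.0 - xi / e_k)
    return u2, v2


if __name__ == "__main__":
    delta = 1.0
    for xi in (-10.0, -2.0, -1.0, 0.0, 1.0, 2.0, 10.0):
        u2, v2 = coherence_factors(xi, delta)
        print(f"xi/delta = {xi:6.1f}  u^2 = {u2:.4f}  v^2 = {v2:.4f}")
```

At $\xi_k = 0$ both probabilities equal one half; well below the chemical potential the pair state is almost certainly occupied, and well above it the occupancy falls off as $\Delta^2/(4\xi_k^2)$, which follows on expanding the expression for $v_k^2$ for $\xi_k \gg \Delta$.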


I mention once again that we have chosen $u_k, v_k$ to be real. More generally, one can choose $u_k$ to be real while $v_k$ can have a constant phase factor $e^{i\phi}$, where $\phi$ sets a gauge in the theory.

A plot of $v_k^2$ against $\epsilon(k)$ (recall that $\xi_k = \epsilon(k) - \mu$ where $\mu = \epsilon_F$ at $T = 0$) looks as in fig. 7-11, where there occurs a sharp drop from $v_k^2 \approx 1$ to $v_k^2 \approx 0$ within a small interval $\Delta$ on either side of $\epsilon(k) = \mu$, and the fall for $|\xi_k| \gg \Delta$ is like $1/\xi_k^2$.

The general nature of variation of the occupancy probability (of $(k\uparrow, -k\downarrow)$ paired states) with $\epsilon(k)$ is similar to the Fermi function at a low but finite temperature, which gives the occupancy probability $f(\epsilon(k))$ of a single-particle $(k\sigma)$ state. It turns out that the $v_k^2$-$\epsilon(k)$ curve (characterizing the superconducting ground state at $T = 0$) is very close to the $f(\epsilon(k))$-$\epsilon(k)$ curve at $T = T_c$, the superconducting transition temperature. This tells us that what is important in superconductivity is not any appreciable change in the occupancy of the single-particle states, but the phase coherence of the paired states (in the ground state worked out above, all the paired states have the same phase that we have chosen to be $\phi = 0$) that sets in below $T_c$ and results in the gap $\Delta$.

Figure 7-11: Depicting the variation (schematic) of the occupancy probability $v_k^2$ (we assume $v_k$ to be real) for the BCS ground state (the equilibrium state at $T = 0$) with displaced energy $\xi_k = \epsilon_k - \mu$ on either side of $\xi_k = 0$; $v_k^2$ drops sharply from $v_k^2 \approx 1$ to $v_k^2 \approx 0$ over a very small interval ranging from $-\Delta$ to $\Delta$; the variation of $v_k^2$ resembles that of the Fermi function at a finite temperature; in particular, the two variations are almost identical when the Fermi function is considered at $T = T_c$, the superconducting transition temperature.


While $\Delta$ denotes the energy gap at $T = 0$ that opens up in the excitation spectrum of the system of electrons (refer to sec. 7.6.5.2 below), it is not a measure of the actual lowering of the ground state energy of the system compared to the energy of the Fermi sea, i.e., the ground state energy of the normal metal. This lowering of the ground state energy, if it happens, tells us that the Fermi sea is indeed destabilized at sufficiently low temperatures in virtue of the pairing interaction, and can be expressed as

$$E_{\rm [superconducting]} - E_{\rm [normal]} = \langle\psi_G(\theta)|K|\psi_G(\theta)\rangle - 2\sum_{k<k_F}\xi_k. \tag{7-143a}$$

On working out the right hand side of the above expression one obtains (I again refer you to [136] for the derivation, which is straightforward) the following expression for the difference in the thermodynamic internal energies (at $T = 0$) between the superconducting phase and the normal phase,

$$U(0)_{\rm [superconducting]} - U(0)_{\rm [normal]} = -\frac{1}{2}\nu_F\Delta(0)^2, \tag{7-143b}$$

which can be identified as the condensation energy (at $T = 0$) giving the extent to which the Fermi sea is destabilized by the formation of Cooper pairs with momenta distributed over a narrow range around the Fermi momentum $k_F$, all in the same phase. By definition, its magnitude equals $\frac{1}{2}\mu_0 H_c^2(0)$ (refer back to formula (7-112)), where $H_c(0)$ is the critical field strength at $T = 0$.


1. I repeat that, though the product $g\nu_F$ is dimensionless (as it should be), the factors $g$ and $\nu_F$ individually scale like $V^{-1}$ and $V$ in the thermodynamic limit. Introducing a re-scaled coupling strength $g_0 = gV$, one can express the binding energy of the Cooper pair and the energy gap $\Delta$ in terms of $g_0 n_F$ (where $n_F$ stands for the density of states per unit volume) in which neither of the two factors $g_0$ and $n_F$ scales with the volume $V$.

2. Formula (7-143b) can be interpreted by observing that $\nu_F\Delta$ number of electrons from an energy slice of width $\Delta$ below the Fermi level all condense into a state at a depth $\Delta$ below the Fermi level, with an average lowering of energy by an amount $\frac{1}{2}\Delta$.


3. The lowering of the BCS ground state energy with respect to the free 
electron ground state (filled up Fermi sea) mostly occurs due to the 
pairing of electrons close to the Fermi level. The pairing of electrons 


deeper down the Fermi sea does not contribute to energy lowering. 


The symbol $\Delta(0)$ in (7-143b) is indicative of the fact (see below) that the energy gap is a temperature dependent quantity $\Delta(T)$, whose value at $T = 0$ has been denoted above by $\Delta$. However, what we have done above is to define the quantity $\Delta$ and to work out its role in determining the ground state of the system of electrons (for the interpretation of $\Delta$ as an energy gap, see sec. 7.6.5.2 below).


Summarizing, we have seen that a weak electron-electron interaction can
result through the mediation of phonons for electrons in matched states
of the form $(k,\uparrow), (-k,\downarrow)$, for wave vectors ($k$) close to the Fermi sphere
(within an energy interval $\sim \hbar\omega_D$). Further, this weak interaction can
explain the formation of a coherent superposition of such paired states,
resulting in a state $|\Psi_G\rangle$ of the superconductor (the BCS ground state)
whose energy is less than the total energy of the Fermi sea by an amount
$\tfrac{1}{2}\nu_F\Delta(0)^2$. In other words, the attractive weak interaction makes the Fermi
sea unstable, causing a macroscopic quantum effect whereby, at T = 0, the
superconductor goes over to the state $|\Psi_G\rangle$.


We will now work out the excitation spectrum in the BCS theory, where $\Delta$
will be found to have the physical interpretation of an energy gap.


7.6.5.2 BCS excitation spectrum: the Bogoliubov transformation 


The BCS excitation spectrum is obtained by invoking a canonical
transformation that converts the grand Hamiltonian (7-130a) into a diagonal
form. One first reduces the Hamiltonian in a manner analogous to a
mean field approximation and then the reduced Hamiltonian is subjected
to a linear canonical transformation (the Bogoliubov transformation,
analogous to (7-32a) invoked for bosonic operators; the same transformation
is associated with the name of Valatin as well), introducing a new
set of fermion annihilation and creation operators from the operators
$\{a_{k\sigma}\}, \{a^\dagger_{k\sigma}\}$. Following [136], I outline the diagonalization procedure by
way of indicating a few relevant steps.


In the first step of reduction, one starts from the BCS pairing Hamiltonian
(7-130b) and replaces operator products such as $a_{k\uparrow}a_{-k\downarrow}$ with their
averages (along with fluctuations of the first degree). Such operator products
do not conserve particle number, but we have seen that already the
ground state is a superposition of states with varying particle numbers
(though one in which the mean number of pairs is large) but with the
same phase. It is this phase coherence along with the large value of
the mean number that leads to large expectation values of operators like
$a_{k\uparrow}a_{-k\downarrow}$, in contrast to their expectation values in similar superpositions
of normal states with random phases. More precisely, we write


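$$a_{k\uparrow}a_{-k\downarrow} = b_k + \left(a_{k\uparrow}a_{-k\downarrow} - b_k\right), \qquad \text{(7-144a)}$$

where

$$b_k = \langle a_{k\uparrow}a_{-k\downarrow}\rangle, \qquad \text{(7-144b)}$$

and where the conjugate $b_k^*$ is similarly defined. On substituting in (7-130b)
and ignoring bilinear terms in the fluctuations $a_{k\uparrow}a_{-k\downarrow} - b_k$, one obtains

$$K = \sum_{k\sigma}\xi_k\, a^\dagger_{k\sigma}a_{k\sigma} + \sum_{kl} V_{kl}\left[a^\dagger_{-k\downarrow}a^\dagger_{k\uparrow}\, b_l + b_k^*\, a_{l\uparrow}a_{-l\downarrow} - b_k^*\, b_l\right] \qquad \text{(7-145)}$$
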
(check this out). The use of the grand Hamiltonian K (refer to (7-130b)) is 
now justified since the particle number conservation is explicitly violated, 
though the fluctuations in the particle number are small at low tempera- 
tures. One now has a grand Hamiltonian that is made of quadratic terms 


which can be diagonalized by means of a canonical transformation of the 


form 
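$$a_{k\uparrow} = u_k^*\,\gamma_k + v_k\,\eta_k^\dagger,$$
$$a_{-k\downarrow}^\dagger = -v_k^*\,\gamma_k + u_k\,\eta_k^\dagger, \qquad \text{(7-146)}$$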
where the numerical coefficients $u_k, v_k$ are to satisfy $|u_k|^2 + |v_k|^2 = 1$,
which implies that the operators $\gamma_k, \eta_k$ satisfy the fermion
anticommutation rules (check this out). The coefficients are now to be chosen so
that the grand Hamiltonian appears in a diagonal form. This is ensured
by choosing


$$|v_k|^2 = 1 - |u_k|^2 = \frac{1}{2}\left(1 - \frac{\xi_k}{E_k}\right), \qquad \text{(7-147a)}$$
where

$$E_k = \sqrt{\xi_k^2 + |\Delta_k|^2}, \qquad \Delta_k = -\sum_l V_{kl}\, b_l, \qquad \text{(7-147b)}$$


(check this out as well). Note that this choice of $u_k, v_k$ coincides with
our earlier variational result (7-142) (recall that (7-142) was obtained on
making the simplification (7-139); more generally, (7-147a) holds in the
variational expression for the ground state). The grand Hamiltonian now
reduces to

$$K = \sum_k\left(\xi_k - E_k + \Delta_k b_k^*\right) + \sum_k E_k\left(\gamma_k^\dagger\gamma_k + \eta_k^\dagger\eta_k\right). \qquad \text{(7-148)}$$


In this expression the first term turns out to be the BCS ground state
energy (i.e., the energy of the Fermi sea augmented by the condensation
energy (7-143b)), while the second term gives the excitation energy
over the ground state, made up of elementary excitations (referred to as
Bogoliubov excitations), or quasi-particles, associated with the fermionic
operators $\gamma_k, \gamma_k^\dagger, \eta_k, \eta_k^\dagger$. A single elementary excitation of momentum $\hbar k$ is
counted by the fermion number operators $\gamma_k^\dagger\gamma_k$, $\eta_k^\dagger\eta_k$ and has an energy
$E_k = \sqrt{\xi_k^2 + |\Delta_k|^2}$. Referring to (7-146), one can evaluate $\Delta_k$ from the
second formula in (7-147b), which provides a consistency condition since the
numbers $b_k$ have to conform to the definition (7-144b). Such an exercise
gives


$$\Delta_k = -\sum_l V_{kl}\, u_l^* v_l\,\langle 1 - \gamma_l^\dagger\gamma_l - \eta_l^\dagger\eta_l\rangle, \qquad \text{(7-149)}$$


where the coefficients $u_k, v_k$ are already known (formula (7-147a)). One
finds that, at T = 0, when no excitations are present, $\Delta_k$ is precisely the
same as that obtained in 7.6.5.1 and thus, for the simplified model of the
electron-electron interaction assumed in sec. 7.6.4, agrees with (7-139)
where $\Delta$ is given by (7-141b) in the weak coupling approximation.


Formula (7-148) leads to the following conclusions regarding the energy
spectrum of the system, considered at T = 0, this spectrum being only
an effective one, obtained by a temperature-dependent reduction from
the full many-body Hamiltonian. Looking at the excitation energy $E_k$ for
a single quasi-particle of momentum $k$, its minimum value is seen to be
$\Delta_k \approx \Delta$ for excitations close to the Fermi level ($\xi_k = 0$). Physically, a pair of
quasi-particles results from the break-up of a Cooper pair, which requires
an energy expenditure of $2\Delta$, where each of the two (weakly interacting)
electrons close to the Fermi level corresponds to an elementary excitation
(however, see [136] for a more precise description of the excitations and
their energies). In other words, the first excited level of the system, next
to the BCS ground state, is at an energy gap of $\Delta$ from the Fermi level
(recall that a Cooper pair has a binding energy of magnitude $\Delta$).
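The diagonalization outlined above can be checked numerically in a small sketch. I assume here, purely for illustration, a real gap and real positive coefficients $u_k, v_k$ (one common phase convention, which need not coincide with the one used in (7-146)); the function name `coherence_factors` is hypothetical. The script verifies that $u_k^2 + v_k^2 = 1$, that the rotated diagonal element equals $E_k$, that the off-diagonal element vanishes, and that the smallest excitation energy is the gap itself.

```python
import math

def coherence_factors(xi, delta):
    """Return (u, v, E) for a real gap delta > 0 and energy xi measured from the Fermi level."""
    e = math.hypot(xi, delta)               # E_k = sqrt(xi^2 + delta^2)
    u = math.sqrt(0.5 * (1.0 + xi / e))     # |u_k|^2 = (1 + xi/E)/2
    v = math.sqrt(0.5 * (1.0 - xi / e))     # |v_k|^2 = (1 - xi/E)/2
    return u, v, e

delta = 1.0  # the gap is used as the unit of energy (illustrative choice)
for xi in (-3.0, -1.0, 0.0, 1.0, 3.0):
    u, v, e = coherence_factors(xi, delta)
    # Elements of the 2x2 matrix [[xi, delta], [delta, -xi]] in the rotated basis
    diag = xi * (u * u - v * v) + 2.0 * delta * u * v       # should equal E_k
    off_diag = 2.0 * xi * u * v - delta * (u * u - v * v)   # should vanish
    print(f"xi={xi:+.1f}  u^2+v^2={u*u + v*v:.6f}  E={e:.4f}  "
          f"diag={diag:.4f}  off={off_diag:+.1e}")
```

The minimum of $E_k$ over the listed values occurs at $\xi_k = 0$, where it equals the gap, in line with the discussion above.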


Fig. 7-12 gives the ‘single-particle’ energy level diagram of a
superconductor [120]. The horizontal line at the bottom gives the average energy
of a single electron in the BCS ground state (recall, though, that the latter
is a collective state of paired electrons). The next horizontal line higher
up in energy corresponds to a single quasi-particle at an energy gap $\Delta$,
where we recall that a quasi-particle is an electron interacting with other
electrons by means of the weak phonon-mediated interaction. Higher
up in the energy scale, there is a continuum of levels corresponding to
quasi-particles with varying kinetic energies. I repeat that the energy gap
$\Delta$ corresponds to a single quasi-particle, while the break-up of a Cooper
pair produces two quasi-particles at a time, which requires a minimum
energy $2\Delta$.

Figure 7-12: The ‘single-particle’ energy level diagram of a superconductor; the
horizontal line at the bottom depicts the average energy of a single electron in the BCS
ground state (which, however, is a collective state of Cooper pairs); the next
horizontal line higher up in energy corresponds to a single quasi-particle at an energy gap $\Delta$,
where we recall that a quasi-particle is an electron interacting with other electrons by
means of the weak phonon-mediated interaction; further up in the energy scale, there
is a continuum of levels corresponding to quasi-particles with varying kinetic energies.


The BCS theory gives the density of states ($\nu_s$) of the quasi-particles (as
compared with the density of states $\nu_n$ of normal electrons), plotted in
fig. 7-13 against the energy separation ($\epsilon - \epsilon_F$) from the Fermi level, where
the gap $\Delta$ is seen to make its appearance. Experimentally, the gap is
detected as an increased absorption of infra-red radiation incident on a
superconducting material.
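The shape of the curve in fig. 7-13 can be tabulated with a few lines of code. The sketch below assumes the standard BCS expression for the relative density of states, $\nu_s/\nu_n = E/\sqrt{E^2 - \Delta^2}$ for $E > \Delta$ and zero inside the gap, which is not written out in the text above; the function name `relative_dos` is hypothetical.

```python
import math

def relative_dos(energy, delta):
    """Assumed BCS ratio nu_s/nu_n at excitation energy `energy` >= 0 above the Fermi level."""
    if energy < delta:
        return 0.0            # no quasi-particle states inside the gap
    if energy == delta:
        return math.inf       # square-root singularity at the gap edge
    return energy / math.sqrt(energy * energy - delta * delta)

delta = 1.0  # the gap is used as the unit of energy (illustrative choice)
for e in (0.5, 1.01, 1.5, 3.0, 10.0):
    print(f"E/Delta = {e:5.2f}   nu_s/nu_n = {relative_dos(e, delta):.4f}")
```

The ratio vanishes below the gap, is strongly enhanced just above it, and approaches unity at energies large compared with the gap.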


Formula (7-149) shows that, generally speaking, the energies $\Delta_k$ depend
on temperature, implying that the gap $\Delta$ is a temperature dependent
quantity too, the expression (7-141b) being its value at T = 0. We now
turn to the temperature dependence of the state of a superconductor as
described by the BCS theory.

Figure 7-13: The variation of the relative density of states (i.e., the ratio of the
quasi-particle density of states ($\nu_s$) and the density of states for normal electrons ($\nu_n$)) against
energy separation above the Fermi level (schematic); the plot shows the gap $\Delta$, depicting
the minimum energy separation of excitations above the Fermi level.


7.6.5.3 Equilibrium states at T > 0, and the transition temperature


Since the operators $\gamma_k, \gamma_k^\dagger, \eta_k, \eta_k^\dagger$ satisfy the Fermi anti-commutation rules,
the occupation probability of the quasi-particle states of energy $E_k$ for
the various values of $k$ at temperature $T = (k_B\beta)^{-1}$ is given, to a good
approximation, by the Fermi function

$$f(E_k) = \frac{1}{e^{\beta E_k} + 1}, \qquad \text{(7-150)}$$

where we have made use of the fact that the Hamiltonian (7-148) does not
include interaction among the quasi-particles, i.e., the system under
consideration can be looked upon as an ideal gas of these entities for which
the chemical potential is zero since their number is indefinite (in contrast
to the other possible point of view describing the same system, namely as
an assembly of electrons with an effective attractive interaction, having
a non-zero chemical potential). Note that, since $\Delta_k$ is positive, $f(E_k) \to 0$
for all $k$ as $T \to 0$, as it should since quasi-particle states are not excited
at T = 0. Taking into account the two pairs of creation and annihilation
operators associated with a quasi-particle state of energy $E_k$, we have

$$\langle 1 - \gamma_k^\dagger\gamma_k - \eta_k^\dagger\eta_k\rangle = 1 - 2f(E_k) = \frac{e^{\beta E_k} - 1}{e^{\beta E_k} + 1} = \tanh\frac{\beta E_k}{2}. \qquad \text{(7-151)}$$

In other words, from (7-149),

$$\Delta_k = -\sum_l V_{kl}\,\frac{\Delta_l}{2E_l}\,\tanh\frac{\beta E_l}{2}. \qquad \text{(7-152)}$$


The non-trivial solution to this self-consistency requirement, under the
simplified model for $V_{kl}$ expressed in (7-138), reads

$$\frac{1}{g} = \frac{1}{2}\sum_k \frac{\tanh(\beta E_k/2)}{E_k}, \qquad \text{(7-153)}$$

which reduces to (7-140) for $T \to 0$ ($\beta \to \infty$), as expected. This is an
implicit equation determining the gap $\Delta(T)$ as a function of the temperature
since $E_k = \sqrt{\xi_k^2 + \Delta_k^2}$, where $\Delta_k$ can be substituted from (7-139), with $\Delta(T)$
replacing $\Delta$.


The superconducting transition temperature $T_c$ is determined from the
condition $\Delta(T_c) = 0$, consistent with the interpretation of $\Delta$ as the order
parameter (up to a phase) characterizing the transition. With $\Delta = 0$, $E_k$
reduces to $|\xi_k|$ and, on replacing the summation in (7-153) by an
integration in the thermodynamic limit (keeping in mind the symmetry of $|\xi_k|$
about the Fermi level) where the density of states $\nu_F$ comes in as in the
zero temperature case, one obtains the result

$$\frac{1}{\nu_F g} = \int_0^{\beta_c\hbar\omega_D/2}\frac{\tanh x}{x}\,dx \qquad \left(\beta_c = \frac{1}{k_B T_c}\right). \qquad \text{(7-154)}$$

The integral can be evaluated in terms of Euler’s constant $\gamma \approx 0.577$,
yielding

$$k_B T_c \approx 1.13\,\hbar\omega_D\, e^{-1/(\nu_F g)}, \qquad \text{(7-155a)}$$

which, when read along with (7-141b), gives

$$\Delta\ (= \Delta(0)) = 1.764\, k_B T_c. \qquad \text{(7-155b)}$$


This formula relates the BCS energy gap (at T = 0) with the transition 


temperature, and is consistent with numerous experimental results. 
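The numerical constants quoted above can be reproduced with a short script. This is only a sketch under stated assumptions: for a large upper limit $a$ the integral of $\tanh x / x$ from 0 to $a$ approaches $\ln(4e^{\gamma}a/\pi)$, so that the factor 1.13 is $2e^{\gamma}/\pi$; I also assume that (7-141b) has the weak-coupling form $\Delta(0) = 2\hbar\omega_D e^{-1/(\nu_F g)}$, which is not reproduced in this section. The helper `tanh_integral` is a hypothetical name.

```python
import math

EULER_GAMMA = 0.5772156649015329

def tanh_integral(a, n=20000):
    """Simpson estimate of the integral of tanh(x)/x from 0 to a (n must be even)."""
    def f(x):
        return 1.0 if x == 0.0 else math.tanh(x) / x   # integrand tends to 1 as x -> 0
    h = a / n
    total = f(0.0) + f(a)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

prefactor = 2.0 * math.exp(EULER_GAMMA) / math.pi   # the constant 1.13 of (7-155a)
ratio = 2.0 / prefactor                              # Delta(0)/(k_B T_c), assuming Delta(0) = 2*hbar*omega_D*exp(-1/(nu_F g))

a = 20.0  # plays the role of beta_c*hbar*omega_D/2 (illustrative weak-coupling value)
print("numerical integral :", tanh_integral(a))
print("asymptotic formula :", math.log(2.0 * prefactor * a))
print("prefactor, ratio   :", prefactor, ratio)
```

Setting the asymptotic form of the integral equal to $1/(\nu_F g)$ and solving for $k_B T_c$ gives (7-155a).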


The dependence of the transition temperature $T_c$ on the Debye frequency
$\omega_D$, as in (7-155a), explains the isotope effect, in which one finds that
$T_c$ depends on the mass of the ions constituting the lattice structure of
the material undergoing the transition. From a fundamental point of
view, the ionic mass enters into the picture since the attractive
electron-electron interaction underlying the formation of the Cooper pairs is an
effective one mediated by phonons. The dependence of $T_c$ on the ionic
mass $M$ appears as

$$T_c \propto M^{-1/2}. \qquad \text{(7-156)}$$


In other words, experimental findings on the isotope effect, conforming 
to (7-156), lend support to the idea of phonon-mediated electron-electron 


interaction that provides the basis of the BCS theory. 


One can determine the temperature dependence of the energy gap by
expressing the summation in (7-153) in the form of an integral in the
thermodynamic limit where the density of states $\nu_F$ again makes its
appearance, and one ends up with

$$\frac{1}{\nu_F g} = \int_0^{\hbar\omega_D}\frac{\tanh\left[\frac{\beta}{2}\sqrt{\xi^2 + \Delta(T)^2}\right]}{\sqrt{\xi^2 + \Delta(T)^2}}\,d\xi. \qquad \text{(7-157)}$$


On evaluating near T = 0, one obtains $\Delta(T) \approx \Delta$, the rate of variation with
T being negligible. On the other hand, close to $T_c$, $\Delta(T)$ drops sharply to
zero, as in fig. 7-14, where one can write, to a good degree of
approximation,

$$\Delta(T) \approx 1.74\,\Delta\left(1 - \frac{T}{T_c}\right)^{1/2}. \qquad \text{(7-158)}$$


This is the characteristic variation of the order parameter in any mean 


field theory. 
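The implicit gap equation can also be solved numerically, which gives a feel for the two limiting behaviours just described. The following is a minimal sketch, not a definitive implementation: it assumes units with $\hbar\omega_D = 1$ and $k_B = 1$, an illustrative coupling $\nu_F g = 0.3$, Simpson's rule for the integral in (7-157), and bisection for the root; the names `gap_integral`, `bisect` and `gap` are hypothetical.

```python
import math

def gap_integral(delta, t, n=2000):
    """Integral on the right of (7-157) with hbar*omega_D = 1, k_B = 1 (Simpson's rule, n even)."""
    def f(xi):
        e = math.hypot(xi, delta)
        if e == 0.0:
            return 0.5 / t                      # limit of tanh(e/2t)/e as e -> 0
        return math.tanh(e / (2.0 * t)) / e
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def bisect(func, lo, hi, steps=60):
    """Root of a decreasing function on [lo, hi] by bisection."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

coupling = 0.3                 # illustrative value of nu_F * g (an assumption)
target = 1.0 / coupling

# Transition temperature: the gap equation is satisfied with delta = 0
t_c = bisect(lambda t: gap_integral(0.0, t) - target, 1e-4, 1.0)

def gap(t):
    """Delta(T) for 0 < t < t_c, from the gap equation at temperature t."""
    return bisect(lambda d: gap_integral(d, t) - target, 0.0, 1.0)

delta0 = gap(0.02 * t_c)       # effectively the zero-temperature gap
print(f"k_B T_c = {t_c:.5f}  Delta(0) = {delta0:.5f}  ratio = {delta0 / t_c:.3f}")
for frac in (0.5, 0.9, 0.98):
    print(f"T/T_c = {frac:.2f}  Delta/Delta(0) = {gap(frac * t_c) / delta0:.3f}"
          f"   1.74*sqrt(1 - T/T_c) = {1.74 * math.sqrt(1.0 - frac):.3f}")
```

With these assumptions the ratio $\Delta(0)/k_B T_c$ comes out close to 1.764, and the square-root law (7-158) is seen to be a good approximation only near $T_c$.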


Figure 7-14: Depicting the variation of the BCS energy gap $\Delta(T)$ with T (schematic);
close to T = 0, the gap $\Delta$, given in (7-141b), remains almost constant as the temperature
is made to vary while, close to the transition temperature $T_c$, the gap drops sharply to
zero as in (7-158), which is typical of a mean field theory.

Once the temperature dependent energy gap $\Delta(T)$ is fixed, the excitation
energies ($E_k(T) = \sqrt{\xi_k^2 + \Delta_k(T)^2}$) of the quasi-particles get determined, and
the Fermi distribution function (7-150) with the temperature dependent
excitation energies can be made use of in working out the thermodynamic
functions of the superconductor. In particular, one can compute the
entropy

$$S = -2k_B\sum_k\left[f(E_k)\ln f(E_k) + (1 - f(E_k))\ln(1 - f(E_k))\right], \qquad \text{(7-159a)}$$

and the electronic contribution to the specific heat of a superconductor

$$C_V^{[\text{superconducting}]} = T\,\frac{\partial S}{\partial T}. \qquad \text{(7-159b)}$$


Alternatively, and equivalently, one can start from the grand partition
function $Z_G = \mathrm{Tr}\, e^{-\beta K}$ and the grand potential $\Omega = -\beta^{-1}\ln Z_G$ and then obtain
the thermodynamic functions by working out the relevant derivatives of the
grand potential.
The variation of the electronic specific heat of a superconductor ($C_V^{[\text{superconducting}]}$)
below the transition temperature $T_c$ is plotted in fig. 7-15 and compared
with the normal electronic specific heat ($C_V^{[\text{normal}]}$) extrapolated from
values above $T_c$, where the latter varies linearly with temperature
($C_V^{[\text{normal}]} = \tfrac{2\pi^2}{3}\nu_F k_B^2 T$). The electronic specific heat suffers a discontinuity by
$\delta C_V \approx 1.43\, C_V^{[\text{normal}]}$ at $T_c$. In the superconducting phase, the specific heat goes to zero
exponentially fast as a function of T as $T \to 0$,

$$C_V^{[\text{superconducting}]} \sim e^{-\Delta/(k_B T)}. \qquad \text{(7-160)}$$

In other words, the temperature dependence of the electronic specific
heat at temperatures $T \approx 0$ is a direct consequence of the existence of the
energy gap $\Delta$ ([120], chapter 9).


In an analogous manner, one can work out the electronic contribution 
to the internal energy and then to the free energy of the superconductor, 
the variation of which with temperature is shown in fig. 7-16 (the lower 
curve). The upper curve depicts the free energy for the normal state, from 
which the free energy of the superconducting state differs by the magnetic 
energy corresponding to the critical magnetic field required to quench the 


superconductivity. 


Figure 7-15: Depicting schematically the variation of the electronic specific heat of
a superconductor (solid line) with temperature; above the superconducting transition
temperature $T_c$, the variation is linear, typical of the electronic specific heat in the
normal state; below $T_c$, the electronic specific heat of the superconductor differs from the
linear extrapolation (sloping dotted line) of the specific heat variation above $T_c$; the
discontinuity at the transition temperature is 1.43 times the normal specific heat; based
on figure 3.3(b) of [136]. As $T \to 0$, the specific heat vanishes exponentially fast.

At any given temperature less than $T_c$, the free energy difference between
the normal and the superconducting states (per unit volume of the
superconductor) equals the critical value of the magnetic energy density that is
released in expelling the field from inside the material, which constitutes
the basis for eq. (7-112). At T = 0, this equals the difference in internal
energies (per unit volume) of the two phases, and is given by $\tfrac{1}{2}\mu_0 H_c(0)^2$.


Referring to the formula (7-112), one observes that accurate measure- 
ments of the critical field strength are of great use in determining the 
thermodynamic parameters of a superconductor and checking consis- 


tency of formulae derived from the BCS theory. 
Figure 7-16: Depicting schematically the variation of the electronic contribution to the
free energy of a superconductor (the lower curve) with temperature; the upper curve
depicts the free energy for the normal state, from which the free energy of the
superconducting state differs by the magnetic energy corresponding to the critical magnetic
field required to quench the superconductivity (refer to formula (7-112)); based on figure
3.3(d) of [136].


7.6.5.4 Zero resistance 


The BCS theory provides a microscopic explanation for the infinite con- 
ductivity of a superconductor — one of the two most salient features of 
the latter, the other being the Meissner effect considered in sec. 7.6.5.6 


below. 


When a current flows through a conductor, there occur continual
scattering events experienced by the current carriers, resulting in the
dissipation of energy and a decay of the current unless it is sustained by
the operation of a driving EMF, the latter being proportional to the
resistance of the conductor. In the case of a superconductor, a current
can flow even without the application of a driving EMF. Such
supercurrents are initiated under the condition of changing magnetic field around
the material but continue to flow undiminished and serve the purpose of
permanently maintaining the interior of the superconductor in a state of
zero magnetic flux (Meissner effect). This feature of a constant
supercurrent flowing without the application of a driving EMF is indicative of zero
resistance, a condition that prevails for temperatures $T < T_c$.


The explanation of zero resistance is based on the energy gap $\Delta$ (we
consider T = 0 for the sake of concreteness) and the binding energy of a
Cooper pair, amounting to $2\Delta$.


The momentum distribution of the assembly of Cooper pairs in the BCS
ground state is perfectly isotropic, implying a zero current in the material.
However, nothing prevents the increase of momenta (say $\mathbf{p}$ and $-\mathbf{p}$, with
opposite spins) of both the electrons in a Cooper pair by a given amount,
say, $\tfrac{1}{2}\mathbf{P}$, so that the momenta are now $\mathbf{p} + \tfrac{1}{2}\mathbf{P}$, $-\mathbf{p} + \tfrac{1}{2}\mathbf{P}$. Imagining the
same increase of momentum (by $\mathbf{P}$) for all the Cooper pairs, we arrive at
a state completely equivalent to the BCS ground state we started from,
with only an added phase factor $e^{i\mathbf{P}\cdot\mathbf{R}/\hbar}$ for each pair (where $\mathbf{R}$ stands for
the position vector of the center of mass of the pair), which implies that
the ground state wave function also acquires a phase and the coherence
between all the pairs remains intact ([68], [120]). This state is the same
as the initial BCS ground state as seen from a moving frame, and is
characterized by the same gap $\Delta$, and binding energy $2\Delta$ of each pair.
The total momentum, which was zero in the initial ground state, is $\mathbf{P}$ per
pair in the phase-shifted ground state, and remains constant as in the
former, implying that there can occur no decay in the current. Indeed, the
momentum cannot change by inelastic collisions suffered by a pair since
such a collision has to provide for a change of energy at least amounting
to $2\Delta$.


The pairs are also immune to momentum change resulting from elastic colli- 
sions (such collisions can change the direction of the momentum, resulting 
in an alteration in the current) since a change in the current would cause 
a corresponding change in the magnetic flux linked with the superconduc- 
tor, where the latter has to be a multiple of an elementary ‘flux quantum’ 
(refer to [68]), in keeping with the phenomenon of flux quantization (see 
sec. 7.6.5.7); such modification of the flux by a precisely defined amount 
being highly improbable, a current-carrying superconductor is necessarily 


in a stable state. 


7.6.5.5 Supercurrents and critical magnetic field 


As mentioned above, a current flows along the surface of a
superconductor, maintaining the interior of the latter in a flux-free state (refer back
to 7.6.2.3). As we see below, the magnitude of the current density
associated with this supercurrent (to be denoted by $j_s$) has to be necessarily
less than a certain maximum value (the critical supercurrent density, to
be denoted by $j_{sc}$) if the Cooper pairs are to remain intact without
breaking up. It is this maximum current density that determines the critical
magnetic field strength ($H_c$) at the surface of the superconductor which,
when exceeded, causes the superconductivity to disappear, along with
the associated features of zero resistance and the Meissner effect.
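A current flow is associated with an increase in the momentum of either
of the two electrons of a Cooper pair from, say, $\pm\mathbf{p}$ to $\pm\mathbf{p} + \tfrac{1}{2}\mathbf{P}$. The
corresponding increase in energy is

$$\delta E = \frac{1}{2m}\left(\pm\mathbf{p} + \tfrac{1}{2}\mathbf{P}\right)^2 - \frac{1}{2m}p^2 = \pm\frac{\mathbf{p}\cdot\mathbf{P}}{2m} + \frac{P^2}{8m}. \qquad \text{(7-161a)}$$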


For situations of interest (when $|\delta E| < \Delta$, see below) one can ignore the
quadratic term in $P$, in which case the maximum possible increase in
energy (for $p \approx p_F$, since one needs to consider only electrons close to the
Fermi level) is

$$\delta E \approx \frac{p_F P}{2m}. \qquad \text{(7-161b)}$$

If the energy ($2\delta E$) imparted to a Cooper pair exceeds the binding energy
$2\Delta$, then the pair breaks up and the material under consideration ceases
to be a superconductor, the condition for which is thus seen to be

$$\frac{p_F P}{m} \approx 2\Delta. \qquad \text{(7-162)}$$


Corresponding to an increase of momentum $P$ of a typical Cooper pair,
the supercurrent density $j_s$ is given by

$$j_s = \frac{n_s e P}{2m}, \qquad \text{(7-163)}$$

where $n_s$ stands for the number density of the electrons constituting the
Cooper pairs. Using the value of $P$ given by (7-162), one obtains the
maximum possible value of the supercurrent density, beyond which the
property of superconductivity disappears,

$$j_{sc} = \frac{n_s e \Delta}{p_F}. \qquad \text{(7-164)}$$

Corresponding to this critical value of the supercurrent density, one can
work out the critical magnetic field at the surface of a superconductor
that causes a breakdown of superconductivity. We imagine a long thick
cylinder of radius $R$, with the supercurrent of density $j_{sc}$ flowing on its
surface in a direction perpendicular to the axis. The magnetic field $H_c$ at
the surface parallel to the axis of the cylinder is obtained from Maxwell’s
equation as

$$2\pi R H_c = \int j_{sc}\, dS. \qquad \text{(7-165a)}$$

The supercurrent is actually confined to a thin layer of width $\lambda_L$ given
by (7-116c) at the surface of the cylinder, which implies

$$2\pi R H_c = 2\pi R\,\lambda_L\, j_{sc}, \quad \text{i.e.,} \quad H_c = \lambda_L\, j_{sc}. \qquad \text{(7-165b)}$$


On making use of (7-164) we obtain, with reference to the BCS theory,
the critical field strength beyond which the material under consideration
ceases to be a superconductor,

$$H_c = \lambda_L\,\frac{n_s e \Delta}{p_F}. \qquad \text{(7-166)}$$
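To get a feel for the orders of magnitude involved in (7-162) to (7-166), one can plug in numbers. The sketch below uses purely illustrative inputs that are not taken from the text (a carrier density of $10^{28}\,\text{m}^{-3}$, a Fermi wave number of $10^{10}\,\text{m}^{-1}$, a gap of 1 meV and a penetration depth of 50 nm); all variable names are hypothetical.

```python
import math

HBAR = 1.054571817e-34       # J s
E_CHARGE = 1.602176634e-19   # C
M_E = 9.1093837015e-31       # kg
MU_0 = 4.0e-7 * math.pi      # T m / A (classical value)

# Illustrative inputs (assumed, not taken from the text)
n_s = 1.0e28                 # m^-3, density of electrons forming Cooper pairs
k_f = 1.0e10                 # m^-1, Fermi wave number
gap = 1.0e-3 * E_CHARGE      # J, energy gap of 1 meV
lambda_l = 5.0e-8            # m, London penetration depth

p_f = HBAR * k_f                                # Fermi momentum
p_crit = 2.0 * M_E * gap / p_f                  # from (7-162): p_F P / m = 2 Delta
j_sc = n_s * E_CHARGE * p_crit / (2.0 * M_E)    # (7-163) evaluated at P = p_crit
h_c = lambda_l * j_sc                           # (7-165b)

print(f"P_crit / p_F = {p_crit / p_f:.2e}")
print(f"j_sc = {j_sc:.3e} A/m^2")
print(f"H_c  = {h_c:.3e} A/m   (mu_0 H_c = {MU_0 * h_c:.3e} T)")
```

The ratio of the critical pair momentum to the Fermi momentum comes out very small for these inputs, which is what justifies dropping the quadratic term in (7-161a).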


The same result can be obtained by equating the expression for the
condensation energy (formula (7-143b)) with the magnetic energy $\tfrac{1}{2}\mu_0 V H_c^2$
that would have been there if the superconductivity were to be made to
disappear.


While we have deduced the above results at T = 0 for the sake of
concreteness, analogous results could be obtained at any temperature $T < T_c$ by
making use of the temperature dependence of the energy gap $\Delta$, given
in (7-158). Thus, according to the BCS theory, the temperature
dependence of the critical field strength is given by


$$H_c(T) \approx H_c(0)\left[1 - \left(\frac{T}{T_c}\right)^2\right]. \qquad \text{(7-167)}$$


7.6.5.6 BCS theory and the Meissner effect 


The BCS theory accounts for the Meissner effect in a superconductor, in
which the bulk of the material remains free of magnetic flux as long as the
temperature is kept less than the transition temperature $T_c$. As indicated
in sec. 7.6.2.3, the Meissner effect can be seen to follow from the second
London equation stated in (7-113a) which, along with the Maxwell
equation (7-114), implies the first (and also the second) equality in (7-115).
The latter implies that the bulk of the superconductor remains free of
magnetic field since there occurs an exponential decay of the field from
the surface inward.


Thus, a derivation of the second London equation from the BCS theory 
serves the purpose of establishing the Meissner effect from microscopic 
considerations. We present here a brief outline of this derivation, follow- 


ing [68]. 
The basic idea underlying the derivation relates to the high degree of
coherence among the Cooper pairs. Since a Cooper pair has a binding
energy of $2\Delta$, this can be taken as a measure of its energy uncertainty,
from which one can work out the momentum uncertainty and then the
spatial extension of the pair, which turns out to be of the order of

$$\xi' \sim \frac{\hbar p_F}{2m\Delta}. \qquad \text{(7-168)}$$


The same estimate for the spatial extension of a Cooper pair is obtained
from a consideration of the lattice deformation brought about in the wake
of an electron that causes a second electron to be caught in an attractive
interaction. The order of magnitude of the volume ‘occupied’ by a Cooper
pair works out to $\sim 10^{-12}\,\text{cm}^3$. Within this rather large volume, something
like $10^6$ to $10^7$ other Cooper pairs can be accommodated, which indicates a
high degree of coherence among the pairs.


The coherence length $\xi$ referred to in sec. 7.6.2.4 compares with $\xi'$, the
typical ‘size’ of a Cooper pair, as $\xi > \xi'$, since it measures the distance over
which the density of Cooper pairs builds up from zero value (corresponding
to a normal conducting region) to the value corresponding to the
superconducting phase, it being apparent on physical grounds that this distance
cannot be smaller than the spatial extension of a pair.


We recall that the BCS ground state wave function (7-129) is isotropic and
corresponds to momentum zero of each Cooper pair. If we now consider a
state in which each pair carries a momentum $\mathbf{P}$, with each member of the
pair carrying an additional momentum $\tfrac{1}{2}\mathbf{P}$ then, as a consequence of the
coherence mentioned above, the net effect on the wave function appears
in the form of a phase factor as

$$|\Psi\rangle = e^{i\phi}\,|\Psi_G\rangle, \qquad \text{(7-169a)}$$

with

$$\phi(R_1, R_2, \ldots, R_\nu, \ldots) = \frac{1}{\hbar}\,\mathbf{P}\cdot(R_1 + R_2 + \cdots + R_\nu + \cdots), \qquad \text{(7-169b)}$$

where the sum on the right hand side is over the center of mass co-ordinates
$R_1, R_2, \ldots, R_\nu, \ldots$ of all the Cooper pairs (we choose a basis in which all
the corresponding operators are diagonal).


We now work out the supercurrent in the presence of a magnetic field,
represented by the vector potential $\mathbf{A}$. The expression for the particle
current density due to a Cooper pair in a state $|\Psi\rangle$ in the presence of the
field is

$$\mathbf{j} = \frac{1}{4m}\left[\langle\Psi|\left(\frac{\hbar}{i}\nabla + 2e\mathbf{A}\right)|\Psi\rangle + \text{c.c.}\right] \qquad \text{(7-170)}$$

(we note that the mass of a Cooper pair is $2m$ and the charge is $-2e$,
the mass and magnitude of charge of an electron being $m$ and $e$; ‘c.c.’
denotes complex conjugate; $\nabla$ stands for the differentiation operator with
respect to the center of mass co-ordinate of the pair, which is assumed
to be diagonal in the representation chosen). The supercurrent for the
assembly of all the Cooper pairs in the superconductor is thus

$$\mathbf{j}_s = -\frac{2e}{4m}\sum_\nu\left[\langle\Psi|\left(\frac{\hbar}{i}\nabla_\nu + 2e\mathbf{A}\right)|\Psi\rangle + \text{c.c.}\right], \qquad \text{(7-171)}$$

where $\nabla_\nu$ stands for the derivative operator with respect to the center
of mass co-ordinate of a Cooper pair carrying the label $\nu$ (the factor $-2e$
appears since $\mathbf{j}_s$ is a charge current density while $\mathbf{j}$ represents a particle
current density).


1. The expression for $|\Psi\rangle$ in (7-169a) is to include an antisymmetrization
operator so as to produce a state that is antisymmetric with respect
to exchange of one-particle states. However, since the BCS ground
state $|\Psi_G\rangle$ is already antisymmetrized, the antisymmetrization
operation in the present context reduces to a simple summation, and can
be disregarded [68].


2. Strictly speaking, the expression for the supercurrent is to depend on 
the co-ordinates of each of the two electrons in a Cooper pair. However, 
the strong coherence among the pairs and the large size of each pair 
implies that it is sufficient to consider only the dependence on the 
center of mass co-ordinates of the pair. This approximation works well 
for a type I superconductor in which there is no sharp field variation 


within its bulk. 


Making use of the expression for the phase $\phi$ given in (7-169b), the
supercurrent given by (7-171) may be seen to work out to

$$\mathbf{j}_s = -\frac{e}{2m}\left[4e\mathbf{A}\,\langle\Psi_G|\Psi_G\rangle + 2\hbar\,\langle\Psi_G|\Psi_G\rangle\sum_\nu\nabla_\nu\phi(R_1, R_2, \ldots, R_\nu, \ldots)\right], \qquad \text{(7-172a)}$$

where $|\Psi_G\rangle$ is to be normalized so as to give the total density of all the
carriers of the supercurrent, i.e.,

$$\langle\Psi_G|\Psi_G\rangle = \frac{n_s}{2}, \qquad \text{(7-172b)}$$

$n_s$ being the number density of the electrons making up the Cooper pairs.
Noting that $\nabla_\nu\phi = \frac{1}{\hbar}\mathbf{P}$ for all $\nu$ and that $\mathrm{curl}\,\mathbf{P} = 0$ (homogeneity), we finally
have, taking the curl of both sides of (7-172a),

$$\mathrm{curl}\,\mathbf{j}_s = -\frac{n_s e^2}{m}\,\mathrm{curl}\,\mathbf{A} = -\frac{n_s e^2}{m}\,\mathbf{B}, \qquad \text{(7-172c)}$$

which, in virtue of (7-113b), is precisely the second of the two London
equations given in (7-113a). As stated earlier in this section (see also
sec. 7.6.2.3) this implies the Meissner effect, according to which the bulk
of a superconductor remains free of magnetic flux. The present derivation
is based on the assumption that the Cooper pair density does not have
a pronounced spatial variation, being almost constant over the extension of
a Cooper pair. This is precisely the condition under which the London
equation itself is valid.


7.6.5.7 Flux quantization 


We continue to consider the supercurrent, i.e., the current carried by
Cooper pairs in a superconductor, in the presence of a magnetic field
$\mathbf{B} = \mathrm{curl}\,\mathbf{A}$.


The source of the magnetic field tends to set up a field in the bulk of the su- 
perconductor, which is annulled by the supercurrent flowing on its surface 


so that the magnetic field in the bulk remains zero. 


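By (7-172a), (7-172b), the supercurrent is obtained as

$$\mathbf{j}_s = -\left[\frac{n_s e^2}{m}\,\mathbf{A} + \frac{e\hbar n_s}{2m}\sum_\nu\nabla_\nu\phi(R_1, R_2, \ldots, R_\nu, \ldots)\right]. \qquad \text{(7-173)}$$

For the sake of concreteness, we consider a closed ring made of the
superconductor, with the field $\mathbf{B}$ in a direction perpendicular to its equatorial
plane as in fig. 7-17, and evaluate the line integral of both sides of (7-173)
along a circular line L chosen to encircle the hollow of the ring. This gives

$$\oint_L \mathbf{j}_s\cdot d\mathbf{l} = -\frac{n_s e^2}{m}\oint_L \mathbf{A}\cdot d\mathbf{l} - \frac{e\hbar n_s}{2m}\sum_\nu\oint_L \nabla_\nu\phi\cdot d\mathbf{l}. \qquad \text{(7-174)}$$

In the case of a stationary state, the change of the phase $\phi$ around the
closed loop will be an integral multiple of $2\pi$. One therefore has,

$$\frac{m}{n_s e^2}\oint_L \mathbf{j}_s\cdot d\mathbf{l} + \int \mathbf{B}\cdot d\mathbf{S} = N\,\frac{h}{2e} \qquad (N = 0, \pm 1, \pm 2, \ldots). \qquad \text{(7-175)}$$
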
Nse 2e 


In the special case of a closed path in a simply connected region within
the bulk of a superconductor, one will have N = 0, which implies the second
London equation given in (7-113a). The expression on the left hand
side of (7-175) is referred to as the fluxoid, which is thus constrained to
be an integral multiple of h/2e. In other words the flux linked with the ring


Figure 7-17: Explaining the principle of flux quantization; a ring made of supercon-
ducting material encloses magnetic flux, with flux lines perpendicular to the equatorial
plane of the ring; the interior of the material within the ring is made flux-free by surface
currents circulating within a thin layer (thickness ~ $\lambda_L$); L is a closed contour encircling
the hollow of the ring; the expression on the left hand side of (7-175) integrated over L
is referred to as the fluxoid associated with it, and is quantized in units of $\Phi_0 = \frac{h}{2e}$; in
the case that L lies within the bulk of the material, this implies the flux quantization
condition (7-176).


and the supercurrent density are related to each other by a quantization
rule. If the closed path L is chosen to lie deep within the material of the
superconductor away from the surface of the ring, the supercurrent $\mathbf{j}_s$
reduces to zero (recall that the supercurrent flows only in a thin layer on
the surface, of thickness $\lambda_L$, the London penetration depth). This implies
that the flux $\int \mathbf{B}\cdot d\mathbf{S}$ linked with the ring (or, more generally, with a surface
bounded by a closed curve in the bulk of a superconductor) is quantized
as
$$\int \mathbf{B}\cdot d\mathbf{S} = N\Phi_0, \qquad \Phi_0 = \frac{h}{2e}, \tag{7-176}$$


where the flux quantum $\Phi_0$ has the value $\approx 2.07 \times 10^{-15}$ Wb.

The flux quantization rule derived above can also be arrived at by follow-
ing an argument along the line of the one leading to the Bohr-Sommerfeld
quantization principle. The result (7-176) arrived at along this route indi-
cates that the BCS state is indeed one based on strong coherence among
the current carriers and that the latter have a charge of magnitude 2e
each.
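As a quick numerical check of the value quoted for the flux quantum, the following minimal Python sketch evaluates h/2e from the exact SI values of the Planck constant and the elementary charge. The function names `flux_quantum` and `flux_quanta_in_ring`, as well as any field strength and ring area one may pass to the latter, are illustrative assumptions introduced here and are not part of the theory.

```python
# Exact SI values of the Planck constant and the elementary charge.
PLANCK_H = 6.62607015e-34            # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def flux_quantum(carrier_charge: float = 2 * ELEMENTARY_CHARGE) -> float:
    """Return h/|q| in webers; the default q = 2e gives the flux quantum h/2e."""
    return PLANCK_H / abs(carrier_charge)


def flux_quanta_in_ring(field_tesla: float, area_m2: float) -> float:
    """Flux B*S through a ring (uniform field assumed), in units of h/2e."""
    return field_tesla * area_m2 / flux_quantum()


if __name__ == "__main__":
    print(f"h/2e = {flux_quantum():.4e} Wb")
    print(f"h/e  = {flux_quantum(ELEMENTARY_CHARGE):.4e} Wb")
```

If the carrier charge were e rather than 2e, the quantum would be twice as large; it is in this sense that the observed value of the flux quantum points to carriers of charge 2e.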


The flux quantization rule is a consequence of quantum mechanical co- 
herence on a macroscopic scale, which is the single most basic feature 
characterizing the phenomenon of superconductivity, and has received 


ample experimental verification. 


The principle of flux quantization determines the distribution of magnetic 
field strength within the bulk of a type II superconductor where flux tubes 
carrying one quantum of magnetic flux are separated from one another 
by flux-free superconducting regions (refer to sec. 7.6.2.4). In such ma- 
terials, it is the interplay between interface energy, magnetic energy, and 
condensation energy that determines the regular lattice-like structure of
the flux tubes.


7.6.6 Superconductivity: the Ginzburg-Landau theory 


The Ginzburg-Landau theory (also referred to as the Landau-Ginzburg
theory, abbreviated in this book as the LG theory) is an enormously
successful theory of superconductivity and is, in a sense, a theory at
an intermediate level between the phenomenological London theory and
the microscopic BCS theory. It is an improvement over the London the-
ory in that it is formulated in keeping with quantum mechanical princi-
ples while correctly accommodating both strong and weak magnetic fields
within its framework and, at the same time, provides a greatly versatile
account of a wide variety of phenomena relating to superconductivity. In
particular, it successfully addresses problems (such as those relating to
thin films or to edge effects) involving inhomogeneities in the concentra-
tion of current carriers that the microscopic BCS theory finds difficult to
deal with.


The theory starts with a putative complex wave function $\Psi$ describing
the macroscopic state of a superconductor, whose squared modulus ($|\Psi|^2$)
gives the density of the supercurrent carriers. It is this macroscopic wave
function that is taken as the order parameter, in terms of which one can 
write down the free energy functional as in sec. 6.3.2, with the differ- 
ence that the order parameter is a complex scalar in the present context 
and the ‘kinetic energy’ term (the fourth term on the right hand side 
of (6-102c)) is to be written in terms of the expectation value of the ap- 
propriate quantum mechanical operator in the presence of a magnetic 


field. 


The free energy density functional appears in the form [136] 


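$$f = f_{n0} + \frac{1}{2}\mu_0 H^2 + \alpha|\Psi|^2 + \frac{1}{2}\gamma|\Psi|^4 + \frac{1}{2M}\left|\left(\frac{\hbar}{i}\nabla - q\mathbf{A}\right)\Psi\right|^2, \tag{7-177}$$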


where $\alpha, \gamma$ are phenomenological constants, H stands for the magnetic
field strength, and A for the vector potential of the field that may be
made to act on the superconductor. The constants M and q stand for the
mass and charge of the current carriers which are to be replaced with
$2m$ and $-2e$ respectively, where m, e stand for the mass and magnitude
of charge of the electron. In the above expression, $f_{n0}$ stands for the free
energy density of the normal metallic phase of the superconductor for which
the electronic specific heat at low temperatures is $\propto T$.


The above values of M,q turn out to be necessary when the consequences 
from the LG theory of superconductivity are compared with experimental 
findings, and establish the link between the phenomenological theory and 
the microscopic one, i.e., the BCS theory. 


In the normal metallic phase, M is to be replaced with the effective mass of 


the electron, which may differ considerably from the electronic mass m. 


Finally, one interprets the complex wave function $\Psi$ by relating its mod-
ulus squared to the density of current carriers in the superconducting
phase,
$$|\Psi|^2 = n_s, \tag{7-178}$$
where, according to the BCS theory, $n_s$ is half the density of the electrons
involved in the formation of the Cooper pairs.


With the expression for the free energy density in place, one can now work
out the consequences in respect of the normal-to-superconductor tran-
sition at $T = T_c$. With an appropriate interpretation of the phenomeno-
logical constants appearing in (7-177), the LG theory turns out to be
successful in explaining a large number of observed features of type II
superconductors in addition to those of the type I variety. The basic idea
is to minimize the integrated free energy density by functional variation
of $\Psi$ and A, subject to appropriate boundary conditions. As for the phe-
nomenological constants $\alpha, \gamma$, one requires
$$\gamma > 0, \quad \alpha = a(\tau - 1) \quad \left(\tau \equiv \frac{T}{T_c},\; a > 0\right), \tag{7-179}$$
(refer back to sections 6.3.2, 6.3.3) if the theory is to correctly describe the
phase transition at $T = T_c$, $H = 0$. Once these requirements are imposed,
more general and varied conditions can be considered to test the theory
and to make use of its predictions.


One notes, first of all, that in the absence of fields and gradients,
$$f = f_{n0} + \alpha|\Psi|^2 + \frac{1}{2}\gamma|\Psi|^4. \tag{7-180a}$$
For $T < T_c$, the left hand side is $f = f_s$, the free energy density in the
superconducting phase, which satisfies (7-112) (note the slight difference
in notation). The minimization of the free energy gives
$$f_{n0} - f_s = \frac{1}{2}\mu_0 H_c^2 = \frac{\alpha^2}{2\gamma}. \tag{7-180b}$$
In view of (7-179), this gives a linear variation of $H_c(T)$ for T less than and
close to $T_c$ (compare with (7-111)).
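The minimization of the uniform free energy can be checked with a few lines of Python. The sketch below assumes the quartic form of the free energy density in the absence of fields and gradients, with a negative quadratic coefficient below the transition and a positive quartic coefficient. It returns the equilibrium carrier density and the condensation energy, and then the critical field that follows from equating the latter with the magnetic energy density. The function names and the scaled units (a = gamma = mu0 = 1) are illustrative assumptions only.

```python
def gl_uniform_free_energy(n, alpha, gamma):
    """f - f_n0 for a uniform state with |Psi|^2 = n and no fields."""
    return alpha * n + 0.5 * gamma * n * n


def gl_uniform_minimum(alpha, gamma):
    """Analytic minimiser: returns (|Psi|^2, f_n0 - f_s)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive for a stable minimum")
    if alpha >= 0:
        # At or above the transition the normal state (|Psi| = 0) is the minimum.
        return 0.0, 0.0
    n_s = -alpha / gamma
    return n_s, alpha * alpha / (2.0 * gamma)


def reduced_critical_field(tau, a=1.0, gamma=1.0, mu0=1.0):
    """H_c from (1/2) mu0 H_c^2 = alpha^2 / (2 gamma), with alpha = a (tau - 1)."""
    alpha = a * (tau - 1.0)
    _, condensation_energy = gl_uniform_minimum(alpha, gamma)
    return (2.0 * condensation_energy / mu0) ** 0.5
```

Since the condensation energy is quadratic in the quadratic coefficient, the critical field obtained in this way is proportional to (1 - T/T_c), which is the linear variation referred to above.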


If we now consider the superconductor under the influence of a magnetic
field and in an inhomogeneous state, we write $\Psi = e^{i\phi}|\Psi|$ and express the
last term of (7-177) as
$$f = \frac{1}{2M}\left[\hbar^2(\nabla|\Psi|)^2 + (\hbar\nabla\phi - q\mathbf{A})^2|\Psi|^2\right]. \tag{7-181}$$


On carrying out the minimization, the first term gives the energy associ-
ated with inhomogeneities of the magnitude of the order parameter, as in
the case of a region separating a normal from a superconducting phase,
while the second term gives the energy associated with a supercurrent in
a gauge-invariant form (refer back to paragraph following (7-119b)). On
choosing an appropriate gauge ($\mathbf{A} \to \mathbf{A} - \frac{\hbar}{q}\nabla\phi$), the supercurrent energy
is obtained in the form
$$w_{[\mathrm{LG}]} = \frac{1}{2M}\, n_s q^2 A^2. \tag{7-182}$$
In this expression $n_s$ stands for the density of the supercurrent carriers
and is in the nature of a phenomenological constant analogous to M and
q (the mass and charge of a supercurrent carrier).


We now compare this with the corresponding expression obtained in the
London theory. If v denotes the mean velocity of an electron then the
energy associated with a current
$$\mathbf{j} = n_e e\mathbf{v}, \tag{7-183a}$$
is given by
$$w_{[\mathrm{London}]} = \frac{1}{2} n_e m v^2. \tag{7-183b}$$
In these expressions, m, e stand for the mass and charge of an electron
and $n_e$ for the electron number density. Combining these with the London
equation
$$\mathbf{j} = -\frac{n_e e^2}{m}\mathbf{A}, \tag{7-183c}$$
(this, taken with (7-113b), reduces to the second relation in (7-113a)) one
arrives at
$$w_{[\mathrm{London}]} = \frac{1}{2m}\, n_e e^2 A^2. \tag{7-183d}$$
On comparing (7-183d) with (7-182), one finds that the LG expression
for the energy density is identical with the London formula under the
identification
$$q = -2e, \quad M = 2m, \quad n_s = \frac{1}{2} n_e,$$


all these being consistent with the microscopic BCS theory. In other 
words, the LG theory incorporates features of the London theory and the
BCS theory. As a corollary, the LG theory explains in its own terms the 


London penetration depth given in (7-116c). 
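The equality of the two energy densities under this identification is a matter of simple arithmetic, which the following sketch verifies numerically. It assumes the London energy density to be the kinetic energy of electrons drifting with the velocity fixed by the London equation, and the LG energy density to be the corresponding expression for carriers of mass M and charge q. The function names, the electron density and the magnitude of the vector potential used in the check are hypothetical values chosen only for illustration.

```python
ELECTRON_MASS = 9.1093837015e-31     # kg (CODATA 2018)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def london_energy_density(n_electrons, vector_potential):
    """Kinetic energy density n m v^2 / 2 with v = e A / m (London equation)."""
    speed = ELEMENTARY_CHARGE * vector_potential / ELECTRON_MASS
    return 0.5 * n_electrons * ELECTRON_MASS * speed * speed


def lg_energy_density(n_carriers, carrier_charge, carrier_mass, vector_potential):
    """Supercurrent energy density n q^2 A^2 / (2 M) of the LG theory."""
    return (n_carriers * carrier_charge ** 2 * vector_potential ** 2
            / (2.0 * carrier_mass))


if __name__ == "__main__":
    n_e = 1.0e28   # hypothetical electron density, m^-3
    a_mag = 1.0e-6  # hypothetical |A|, T m
    print(london_energy_density(n_e, a_mag))
    print(lg_energy_density(0.5 * n_e, -2 * ELEMENTARY_CHARGE,
                            2 * ELECTRON_MASS, a_mag))
```

Only the square of the carrier charge enters, so the sign of q plays no role in this comparison.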


Turning back to the LG free energy functional, the minimization of the free
energy with respect to $\Psi$ and A gives rise to the following Ginzburg-Landau
differential equations:
$$\frac{1}{2M}\left(\frac{\hbar}{i}\nabla - q\mathbf{A}\right)^2\Psi + \alpha\Psi + \gamma|\Psi|^2\Psi = 0, \tag{7-184a}$$
$$\mathbf{j}_s = \operatorname{curl}\mathbf{H} = \frac{q\hbar}{2iM}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) - \frac{q^2}{M}|\Psi|^2\mathbf{A}, \tag{7-184b}$$
where the last equation is precisely the quantum mechanical expression
for the current pertaining to mass M and charge q in a state described
by the wave function $\Psi$. In order to relate to a physical situation, one
has to take into account the relevant boundary conditions and specify
the values of the phenomenological constants M, q. In addition, one has
to make the physical identification $|\Psi|^2 = n_s$, and relate $n_s$ to the density
of electrons carrying the supercurrent $\mathbf{j}_s$.


As an instance of the boundary condition to be satisfied by $\Psi$ and A, one
can refer to a situation where the current at the surface flows only in the
tangential direction, in which case one has
$$\left[\left(\frac{\hbar}{i}\nabla - q\mathbf{A}\right)\Psi\right]_\perp = 0, \tag{7-185}$$
where the suffix $\perp$ denotes the perpendicular component at the boundary
surface (interface with an insulating medium).


The LG theory involves two characteristic length parameters of funda-
mental relevance. One of these, the London penetration depth, has al-
ready been mentioned above. The other characteristic length, referred to
as the LG coherence length, has been mentioned in sec. 7.6.2.4, and is
implied by eq. (7-184a). For a temperature $T < T_c$, we define
$$\psi = \sqrt{-\frac{\gamma}{\alpha}}\,\Psi, \tag{7-186a}$$
when we get [79]
$$\left(\frac{\hbar^2}{2M\alpha}\right)\left(\frac{1}{i}\nabla - \frac{q}{\hbar}\mathbf{A}\right)^2\psi + \psi - |\psi|^2\psi = 0. \tag{7-186b}$$
In this formula, the expression $\left(\frac{\hbar}{\sqrt{-2M\alpha}}\right)$ has the dimension of length and
constitutes the LG coherence length $\xi$. The ratio $\kappa$ defined in (7-118) de-
termines the wall energy of domains between normal and superconduct-
ing regions and distinguishes between type I and type II superconductors.
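The temperature dependence of the coherence length can be illustrated with a small sketch. It assumes the coherence length to be given by the Planck constant (divided by 2 pi) over the square root of (-2 M alpha), with the quadratic coefficient alpha negative below the transition and varying as a(T/T_c - 1), and with M taken as twice the electron mass. The function names and the value of the constant a used in the demonstration are hypothetical and serve only to fix the units.

```python
import math

HBAR = 1.054571817e-34            # J s
ELECTRON_MASS = 9.1093837015e-31  # kg (CODATA 2018)


def alpha_of_tau(tau, a):
    """Quadratic coefficient alpha = a (tau - 1), with tau = T / T_c and a > 0."""
    return a * (tau - 1.0)


def lg_coherence_length(alpha, carrier_mass=2 * ELECTRON_MASS):
    """Coherence length hbar / sqrt(-2 M alpha); defined only for alpha < 0."""
    if alpha >= 0:
        raise ValueError("the coherence length is defined here only for T < T_c")
    return HBAR / math.sqrt(-2.0 * carrier_mass * alpha)


if __name__ == "__main__":
    a_joule = 1.0e-22  # hypothetical value of the constant a, in joules
    for tau in (0.5, 0.9, 0.99):
        xi = lg_coherence_length(alpha_of_tau(tau, a_joule))
        print(f"tau = {tau}: xi = {xi:.3e} m")
```

With this form of alpha the coherence length grows as the inverse square root of (1 - T/T_c) on approaching the transition from below.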


In this book we will not enter into applications of the LG theory to the 
explanation of diverse phenomena in type I and type II superconductors. 
We will close this section and the present chapter with a few remarks 


comparing the LG theory with the London theory [120]. 


The solutions to the LG differential equations arising from the minimiza-
tion of the free energy functional are easily found in problems involving
weak magnetic fields and agree with those obtained in the London theory.
In the case of a strong field the solutions are mostly to be obtained nu-
merically. For an infinitely extended superconducting plate with the field
applied parallel to the surface, $|\Psi|^2$ is found to be constant in the bulk of
the material, decreasing towards the surface by an amount depending on
the field strength, thereby implying a field dependent penetration depth.


The other major area in which the LG theory leads to novel results is
in the case of thin films, where the penetration depth is a function of
the thickness. The critical magnetic field, obtained by equating the mag-
netic energy with the difference in free energies of the normal and the
superconducting phases, can be worked out in the form of approximate
formulas for the cases $a \gg \lambda$ and $a \ll \lambda$. Here a stands for the thick-
ness of the film and $\lambda$ for the penetration depth (as we have seen, the LG
theory reproduces the penetration depth implied in the London theory).
The result for $a \gg \lambda$ agrees with what is found in the London theory.
However, for $a \ll \lambda$ the discrepancy between the two theories widens.
A novel finding in this case in the Ginzburg-Landau theory is that the
phase transition for a non-zero value of the field is of the second kind,
rather than the first (as one finds in the case of a bulk superconductor).


Chapter 8


Non-equilibrium Statistical 


Mechanics 


8.1 Non-equilibrium statistical mechanics: Introduction


Equilibrium statistical mechanics is based on a few well-established and 
well-tested principles. One can go even further and say that there is just 
one single basic principle, namely, the assumption of the probability dis- 
tribution in the microcanonical ensemble (sections 2.1.1, 2.2.1) or that in 
any of the equivalent equilibrium ensembles, either classical or quantum 
mechanical (recall Feynman’s observation, section 1.1.2). Of course, it is 
completely another matter when one tries to justify the basic principle(s) 
or to establish the condition(s) of validity and equivalence of the equilib-
rium ensembles or, finally, to apply the principles to problems of practi-
cal relevance, where a vast number of approximation schemes have been
developed in order to arrive at meaningful results. In the end, the prin-
ciples of equilibrium statistical mechanics have had amazing success in
explaining and interpreting an immensely extended area of observations 


and theory relating to equilibrium thermodynamics. 


Non-equilibrium thermodynamics and non-equilibrium statistical mechan- 
ics pose a different set of problems altogether, neither of these having 
well-established basic principles that can help in explaining and inter- 
preting the entire range of non-equilibrium phenomena, simply because 
the latter constitute, by definition, an infinitely vast area, with a bewil- 
dering complexity of time- and length scales, where experimental settings 


involve a correspondingly large set of controlling parameters. 


However, one can identify certain ranges of phenomena where a number 
of basic principles can be formulated that have amply vindicated them- 
selves. Notable among these is the linear regime where appropriately 
defined sets of forces and currents (or fluxes) are linearly related to each 
other. Such linear relations lie at the base of a well-explored area in the 
thermodynamics of irreversible processes and of a corresponding area in 


non-equilibrium statistical mechanics. 


In sec. 8.2 below I briefly state a number of basic principles of the ther- 
modynamics of irreversible processes (also referred to as ‘non-equilibrium 
thermodynamics’). We will then begin to look at the principles of non- 
equilibrium statistical mechanics where the linear regime will be explored 
first. This is the regime where the systems under investigation are, in 


a certain sense, close to equilibrium states. This can be looked at as
the traditional or classical area of non-equilibrium statistical mechanics.
Another area of non-equilibrium statistical mechanics of relatively recent 
vintage is that of stationary states and non-equilibrium processes not nec- 
essarily close to equilibrium ones. In particular, the theoretical results 
relating to thermostatted systems and transient fluctuation relations are 
quite well developed and have stood the test of numerical simulations and 
experimental observations, including ones on nanosystems and macro- 


molecular assemblies. 


The theory of dynamical chaos at the molecular level has acquired rele- 
vance in recent decades in the context of understanding non-equilibrium 
thermodynamic processes and in developing a number of principles in 
non-equilibrium statistical mechanics, e.g., in the escape rate formal- 
ism based on chaotic scattering, and the formulation of non-equilibrium 
probability distributions. In turn, non-equilibrium thermodynamics and 
statistical mechanics have helped develop a number of fundamental con- 
cepts in the theory of dynamical chaos. I will present a brief sketch of 
these aspects of non-equilibrium statistical mechanics in the next chap- 
ter in this book which, essentially, will be a continuation of the issues in 


non-equilibrium statistical mechanics taken up in the present chapter. 


Of course, the biggest question in non-equilibrium statistical mechan- 
ics remains: why is it that macroscopic systems under specified sets of 
constraints approach the equilibrium configuration? This is the ques- 
tion that Boltzmann and Gibbs - great pioneers in the subject that they 
were - asked and provided partial answers to. Their answers were ulti-
mately based on general principles of molecular dynamics and probability
theory in two different but closely related ways. In addressing this ques-
tion (too big, really, for an adequate answer) one comes face to face with
the even bigger question - the ultimate one in non-equilibrium statis- 
tical mechanics: is it possible to formulate the most general principles 
of non-equilibrium evolution of macroscopic systems under constraints 
more general than the ones studied in equilibrium thermodynamics and 
statistical mechanics? This, however, is too general and too vague a ques- 
tion to be addressed with any degree of precision and concreteness, and 


is best left to the realm of speculation. 


The first important progress (one of surpassing importance even in to- 
day’s context) in the application of principles of molecular dynamics to 
non-equilibrium processes was the one relating to the Boltzmann equa- 
tion. The present chapter will include a brief introduction to this devel- 


opment of enormous significance initiated by Ludwig Boltzmann. 


8.2 Non-equilibrium thermodynamics: a brief outline


8.2.1 Non-equilibrium thermodynamics: introduction 


The subject of non-equilibrium thermodynamics (also referred to as ‘ther-
modynamics of irreversible processes’), an extension of equilibrium ther-
modynamics (or, simply, ‘thermodynamics’), is of a limited scope since it
is confined to the consideration of processes occurring close to an equilib-
rium state of the system under consideration. The system passes through
a succession of non-equilibrium states, where each such state is char-
acterized by thermodynamic parameters close to the parameters defining
the equilibrium state ($\mathcal{E}$) in question. Though limited in scope from a the-
oretical point of view, non-equilibrium thermodynamics is nevertheless a
discipline of great practical relevance.


In an equilibrium configuration, the thermodynamic parameters such as 
the pressure, temperature, specific energy, or specific volume (let us de- 
note these collectively by P), remain constant in space and time since any 
deviation from homogeneity or stationarity sets up processes that tend to 
bring the system back to equilibrium. The fundamental assumption in 
non-equilibrium thermodynamics is the one of local equilibrium where it 
is assumed that any and every small volume element (to be referred to 
as a subsystem) in the system is instantaneously in some equilibrium 
state $\mathcal{E}'$ of its own with well defined values of the thermodynamic parame-
ters $P'$ close to those pertaining to the equilibrium configuration $\mathcal{E}$ of the
system in question.


The subsystems are assumed to be small but still of macroscopic dimen- 
sions, the interactions between them being sufficiently weak so that the 
states of instantaneous local equilibrium of the subsystems are altered 


slowly in space and time. 


The terms ‘instantaneous’ and ‘local’ are to be interpreted in the sense 
that these refer to time intervals and spatial extensions small com- 
pared to the spatial and temporal scales characterizing the changes 


that take place in the system as a whole.

If one imagines the interactions among the various subsystems to be
switched off at any given instant of time then the entire system can be 
looked upon as one in a state of constrained equilibrium, the constraints 
being precisely the (imagined) ones that make the subsystems mutually 
non-interacting. The interactions that actually occur among the sub- 
systems cause the state of constrained equilibrium to get altered, to be 
succeeded by a new (and slightly different) state of instantaneous local 
equilibrium. This transition from one constrained equilibrium to another 
continues, constituting a non-equilibrium process in the system under 


consideration. 


At any given instant of time, each subsystem is characterized by some
particular value of the entropy $\delta S$ (with the symbol $\delta$ denoting that it is
a small subsystem that we are referring to). The instantaneous entropy
of the system as a whole is obtained by adding up all these subsystem
entropies (recall that the entropy is an extensive variable and can be
added up in the case of non-interacting systems). By the second law of
thermodynamics, this total entropy (S) increases as the system passes
through the succession of instantaneous states of constrained equilibria.


In other words, one assumes that processes occurring within the system 
can be classified in two distinct categories — one involving rapid changes 
within small subsystems that quickly bring these to their respective lo- 
cal equilibrium states, and the other involving relatively slow interactions 
and exchanges between the subsystems. From a mathematical point of 
view, the processes of the latter type are characterized by a set of rela-
tively large length- and time scales and, in the case of a fluid, are referred
to as non-turbulent hydrodynamic processes since turbulent processes
involve far-from-equilibrium phenomena where the segregation in terms 
of length- and time scales breaks down. In other words, irreversible pro- 
cesses close to equilibrium can be described within the framework of hy- 
drodynamics on the basis of the assumption of local equilibrium. Within 
this limited scope, one defines a set of thermodynamic driving forces (or 
affinities) and fluxes that are assumed to be related linearly with each 
other, where the linear relations involve a set of kinetic coefficients. The 
final crucial assumption of irreversible thermodynamics is that these ki- 
netic coefficients satisfy a set of reciprocity relations between themselves. 
The reciprocity relations reflect the fact that, in microscopic terms, the 
processes are reversible in nature, being amenable to a description in 
terms of Hamiltonian mechanics, and were established by Onsager from 


fluctuation theory ([15], chapter 14; see sec. 8.5 below). 


8.2.2 Mathematical formulation 


The present section is based principally on [15], chapter 14. 


As mentioned above, the term ‘non-equilibrium thermodynamics’ essen- 
tially refers, in the present context, to a theory where the principle of local 
equilibrium holds and where thermodynamic affinities and fluxes are re- 
lated linearly. This, in turn, implies that the principles of non-equilibrium 
thermodynamics are fundamentally built upon those of thermodynamics 


of equilibrium states. 


The thermodynamic characteristics of any given thermodynamic system
are encapsulated in the fundamental relation that specifies its entropy
(S) as a function of a set of extensive thermodynamic state variables
$X_0, X_1, X_2, \cdots$, where $X_0$ commonly stands for the internal energy while
the rest of the extensive variables include the volume of the system (along
with other variables such as the mole numbers of various components in
the case of a multi-component system, and the magnetic moment in the
case of a magnetic system):
$$S = S(X_0, X_1, \cdots). \tag{8-1}$$


For instance, in the case of a one-component non-magnetic fluid, one
has three independent extensive variables, $X_0 = U$, $X_1 = V$, $X_2 = N$, in a
notation by now familiar.


We adopt the use of particle numbers instead of mole numbers. While 
the latter are commonly used in thermodynamics, the former are 


more conveniently employed in statistical mechanics. 


Considering a family of equilibrium states with continuously varying val-
ues of the extensive variables, one can obtain, by partial differentiation,
a set of intensive thermodynamic variables ($\eta_0, \eta_1, \cdots$) conjugate to the ex-
tensive ones,
$$\eta_k = \frac{\partial S}{\partial X_k} \quad (k = 0, 1, 2, \cdots), \tag{8-2a}$$
where all the extensive variables other than $X_k$ are assumed to be held
fixed.

For instance, one has $\frac{1}{T} = \frac{\partial S}{\partial U}$ corresponding to the internal energy, and
$\frac{P}{T} = \frac{\partial S}{\partial V}$ corresponding to the volume. The relations (8-2a) can equivalently
be expressed in the form
$$dS = \sum_k \eta_k\, dX_k. \tag{8-2b}$$
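The passage from a fundamental relation to the conjugate intensive variables can be illustrated numerically. The sketch below assumes, purely for the sake of illustration, an ideal-gas-like fundamental relation in units with the Boltzmann constant set to unity (the additive constant c is arbitrary), and estimates the partial derivatives of the entropy by central differences with all other extensive variables held fixed. The function names are hypothetical.

```python
import math


def entropy(U, V, N, c=0.0):
    """Ideal-gas-like fundamental relation S(U, V, N), in units with k_B = 1."""
    return N * (math.log(V / N) + 1.5 * math.log(U / N) + c)


def conjugate_intensive(S, X, k, rel_step=1e-6):
    """Central-difference estimate of dS/dX_k, the other variables held fixed."""
    h = rel_step * abs(X[k])
    up, down = list(X), list(X)
    up[k] += h
    down[k] -= h
    return (S(*up) - S(*down)) / (2.0 * h)


if __name__ == "__main__":
    state = (3.0, 2.0, 1.0)  # (U, V, N) in arbitrary units
    print("1/T =", conjugate_intensive(entropy, state, 0))  # compare 3N/(2U)
    print("P/T =", conjugate_intensive(entropy, state, 1))  # compare N/V
```

For this particular relation the two derivatives reproduce the familiar results U = (3/2) N T and P V = N T, which is a convenient check on the sign and ordering conventions.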


In the case of a system in which irreversible processes are taking place
one can consider small constituent subsystems as explained in sec. 8.2.1
and define, for a typical such subsystem, its entropy $\delta S$ in terms of
$\delta X_0, \delta X_1, \cdots$, its extensive parameters (as already explained, the symbol
$\delta$ here indicates that we are considering a small volume element, though
each of these volume elements is to be of macroscopic proportions), by
means of the same function as the one on the right hand side of (8-1) is
of the variables $X_0, X_1, \cdots$. The entropy of a non-equilibrium state is then
obtained as the sum of entropies of all the subsystems:
$$S = \sum_{\text{subsystems}} \delta S. \tag{8-3}$$
Further, we define the specific values (i.e., the values per unit volume
around any given point in the system which we take to be a fluid for the
sake of concreteness, possibly made up of several components) of entropy
and of the parameters $X_0, X_1, \cdots$ as
$$s = \frac{\delta S}{\delta V}, \quad \xi_k = \frac{\delta X_k}{\delta V} \quad (k = 0, 1, 2, \cdots). \tag{8-4}$$


It is then possible, with the help of the above formulas, to define the spe-
cific entropy as a function of the specific values of the extensive variables
$\xi_1, \xi_2, \cdots$ as
$$s = s(\xi_1, \xi_2, \cdots), \tag{8-5}$$
where now the volume drops out of the list of $\xi$'s, and one can take
$\xi_1 = \frac{\delta U}{\delta V}$ while $\xi_2, \xi_3, \cdots$ correspond to the remaining extensive variables.
All the $\xi$'s and the specific entropy s now appear, in general, as functions
of the position vector within the system under consideration, as also of
time. Finally, the corresponding intensive variables $\eta_1, \eta_2, \cdots$, which are
now functions of the position vector too (and also of time), can be defined
in a manner analogous to (8-2a), and the relation
$$ds = \sum_k \eta_k\, d\xi_k, \tag{8-6}$$
holds, analogous to (8-2b), with the changed definitions of the variables
concerned.


A non-equilibrium process corresponds to the setting up of thermody-
namic affinities and fluxes in the system that can, in general, vary from
point to point within it, though this dependence on the position vector
(and also on time) will be left implied in most of the formulas below. All
the intensive and specific extensive variables, including the specific en-
tropy s, now vary in space and time, and one can define a current density
corresponding to the variable $\xi_k$ as a vector $\mathbf{j}_k$ ($k = 1, 2, \cdots$) whose magni-
tude equals the amount of the corresponding physical quantity (e.g.,
the energy, or, say, the number of moles of some particular chemical com-
ponent in the system) that flows per unit time per unit area around any
given point where the area is assumed to be oriented perpendicularly to
the direction of flow at that point, the latter being defined as the direction
of $\mathbf{j}_k$.


In the case of a conserved quantity (say, X, with local density $\xi$) such
as the energy or the mole number of a chemical component (the latter
subject to the condition that there occurs no chemical reaction in the
system), one obtains an equation of continuity of the form
$$\frac{\partial \xi}{\partial t} + \nabla\cdot\mathbf{j} = 0, \tag{8-7}$$
where j stands for the corresponding current density as defined above.


The current j arises in virtue of the interaction among the various
subsystems; if the interaction were absent then the parameter $\xi$ would
remain unchanged in virtue of X being a conserved quantity.
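The content of the equation of continuity is that the local density changes only through the net current across the boundary of a volume element, so that the total amount of the conserved quantity stays fixed. The following one-dimensional sketch makes this explicit on a periodic grid of cells (the subsystems). The diffusive form of the current used to drive the evolution, along with the function names and the numerical parameters, is an assumption made only for illustration; the conservation property holds for any choice of the face currents.

```python
def diffusive_flux(xi, diffusivity, dx):
    """Current through the face between cell i-1 and cell i (periodic grid).

    The law j = -D d(xi)/dx is an illustrative assumption, not part of the
    equation of continuity itself.
    """
    n = len(xi)
    return [-diffusivity * (xi[i] - xi[i - 1]) / dx for i in range(n)]


def step_continuity(xi, face_flux, dt, dx):
    """One explicit update of d(xi)/dt + d(j)/dx = 0 on a periodic grid.

    Cell i gains what enters through its left face (index i) and loses what
    leaves through its right face (index i + 1).
    """
    n = len(xi)
    return [xi[i] - dt / dx * (face_flux[(i + 1) % n] - face_flux[i])
            for i in range(n)]


def evolve(xi, diffusivity, dt, dx, steps):
    """Repeat the update; stable for diffusivity * dt / dx**2 <= 0.5."""
    for _ in range(steps):
        xi = step_continuity(xi, diffusive_flux(xi, diffusivity, dx), dt, dx)
    return xi
```

Whatever leaves one cell through a face enters its neighbour through the same face, so the sum over all cells is unchanged by each update, in keeping with X being a conserved quantity.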


In virtue of (8-6) expressing the fundamental thermodynamic relation in
local terms, the entropy density satisfies an equation of continuity of a
more general form, namely
$$\frac{ds}{dt} = \frac{\partial s}{\partial t} + \nabla\cdot\mathbf{j}_s, \tag{8-8}$$
where the left hand side represents the entropy production per unit vol-
ume, i.e., the rate at which entropy is produced per unit volume because
of the occurrence of irreversible processes within the system (this will, at
times, be denoted by $\sigma_s$ in subsequent sections), and $\mathbf{j}_s$ stands for the en-
tropy current density, the latter being given by (refer to formula (8-2b) and
to the physical interpretation of the various current densities; see [15])
$$\mathbf{j}_s = \sum_k \eta_k\, \mathbf{j}_k. \tag{8-9}$$
k 


On making use of the equations of continuity of the $\xi_k$'s and the entropy
density, one obtains the formula
$$\frac{ds}{dt} = \sum_k \nabla\eta_k\cdot\mathbf{j}_k, \tag{8-10a}$$
or
$$\frac{ds}{dt} = \sum_k \mathbf{A}_k\cdot\mathbf{j}_k, \tag{8-10b}$$
where the thermodynamic forces or affinities, $\mathbf{A}_k$ ($k = 1, 2, \cdots$), are defined
as
$$\mathbf{A}_k = \nabla\eta_k. \tag{8-11}$$
For instance, the thermodynamic force responsible for energy flow is
given by $\nabla\left(\frac{1}{T}\right)$, and that for the flow of a chemical component with chemi-
cal potential $\mu$ is $\nabla\left(-\frac{\mu}{T}\right)$. The second law of thermodynamics implies that
the expression $\sum_k \mathbf{A}_k\cdot\mathbf{j}_k$ representing the entropy production is a positive
definite one.
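As a simple illustration of the positivity of the entropy production, consider pure heat conduction in one dimension, where the only affinity is the gradient of the inverse temperature and the only current is the energy current. The sketch below assumes Fourier's law for the energy current, with a positive thermal conductivity `kappa`; this constitutive law, the function names and the numerical values used in any check are assumptions made for illustration.

```python
def energy_affinity(temperature, dT_dx):
    """x-component of grad(1/T), i.e. -(1/T^2) dT/dx."""
    return -dT_dx / (temperature * temperature)


def fourier_energy_current(kappa, dT_dx):
    """Energy current j = -kappa dT/dx (Fourier's law, assumed here)."""
    return -kappa * dT_dx


def entropy_production(temperature, dT_dx, kappa):
    """Product of affinity and current: kappa (dT/dx)^2 / T^2."""
    return (energy_affinity(temperature, dT_dx)
            * fourier_energy_current(kappa, dT_dx))
```

The product is proportional to the square of the temperature gradient, so it is positive for either direction of the gradient and vanishes only in the uniform state, provided the conductivity is positive.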


Equations (8-9), (8-10b), (8-11) are the basic formulae of non-equilibrium 


thermodynamics. The application of these formulae to actual problems
requires that the affinities be specified in terms of the fluxes. For irre-
versible processes occurring sufficiently close to equilibrium states, the
two sets of quantities may be assumed to be related linearly to each other, 
where a set of kinetic coefficients make their appearance. As mentioned 
earlier, the reversibility of the equations of molecular dynamics underly- 
ing all these processes imposes a constraint on these coefficients, referred 
to as the reciprocity relations. These relations reduce greatly the possi- 
ble number of independent kinetic coefficients. Beyond the reciprocity 
relations, non-equilibrium thermodynamics does not tell us more about 
the coefficients, and these remain as phenomenological constants in the 
theory, to be determined experimentally. Further progress is made possi- 
ble in non-equilibrium statistical mechanics where these coefficients are 
related to fluctuation properties of equilibrium states characterizing the 


systems under consideration, as we will see in section 8.5.


The affinities and current densities ($\mathbf{A}_k, \mathbf{j}_k$ ($k = 1, 2, \cdots$)) are vector quan-
tities. However, the three Cartesian components of each of these are
independent of one another and these Cartesian components, consid-
ered for the various possible values of the sub-index k and denoted by
$A_r, j_r$ ($r = 1, 2, \cdots$), can be taken as the quantities of interest, there be-
ing three each of these for each physical quantity (i.e., for each specific
value of k) such as the energy or the mole number of each of the chemical
components.


Though the components of thermodynamic forces, or affinities, $A_r$ are
defined in correspondence with the respective components of current
densities, or fluxes, $j_r$ ($r = 1, 2, \ldots$), each of the flux components is linearly
related to all the force components. In stating this linear relation we
consider a restricted class of systems referred to as resistive ones where 
the flux components at any given instant of time depend on the force 
components at the same instant, there being no memory effect in the 
relation between the two sets of quantities. Notable exceptions to this 
assumption are provided by electrical circuits containing inductive and 
capacitive elements. Leaving aside these electrical systems (with their 
associated magnetic processes), a very large number of thermodynamic 


systems of interest can be adequately described as resistive ones. 


Confining our attention to a resistive system, the linear relation between 
the independent affinities and current densities mentioned above can be 


stated as 
$$j_r = \sum_s L_{rs} A_s \quad (r = 1, 2, \ldots), \tag{8-12}$$


where the coefficients $L_{rs}$ are, in the present context, a set of
phenomenological constants, referred to as the kinetic coefficients. The reciprocity
relations satisfied by these coefficients, postulated by Onsager on the
basis of the principle of microscopic reversibility, can now be stated as

$$L_{rs} = L_{sr} \quad (r, s = 1, 2, \ldots). \tag{8-13a}$$


In writing the formulae (8-12), (8-13a), it has been implicitly assumed 
that there is no magnetic field imposed on the system under consideration.
In the presence of a magnetic field $\mathbf{B}$, the dependence of the kinetic
coefficients on $\mathbf{B}$ is to be considered explicitly since the principle of
microscopic reversibility requires that, under the operation of time reversal,
$\mathbf{B}$ is to be replaced with $-\mathbf{B}$. The reciprocity relations then appear in the
form

$$L_{rs}(\mathbf{B}) = L_{sr}(-\mathbf{B}) \quad (r, s = 1, 2, \ldots). \tag{8-13b}$$
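As a concrete illustration of (8-12) and (8-13a), the following minimal Python sketch applies a two-component kinetic matrix to a pair of affinities and evaluates the bilinear form $\sum_r A_r j_r$ for the entropy production. The numerical entries of `L` and `A` are hypothetical, chosen only so that `L` is symmetric and positive definite; they do not describe any real material, and the function names are introduced here for illustration.

```python
# Hypothetical kinetic matrix in arbitrary units (an assumption of this sketch).
L = [[2.0, 0.5],
     [0.5, 1.0]]


def fluxes(L, A):
    """Linear flux-force relation (8-12): j_r = sum_s L_rs A_s."""
    return [sum(L_rs * A_s for L_rs, A_s in zip(row, A)) for row in L]


def entropy_production(L, A):
    """Bilinear form sum_r A_r j_r built from the fluxes above."""
    return sum(A_r * j_r for A_r, j_r in zip(A, fluxes(L, A)))


def is_reciprocal(L, tol=1e-12):
    """Check the reciprocity relations (8-13a): L_rs = L_sr (no magnetic field)."""
    n = len(L)
    return all(abs(L[r][s] - L[s][r]) <= tol
               for r in range(n) for s in range(n))


A = [0.3, -1.2]                      # hypothetical affinity components
j = fluxes(L, A)
sigma = entropy_production(L, A)
print(is_reciprocal(L), j, sigma)
```

Note that each flux component responds to both affinities through the off-diagonal entries, which is the cross-coupling that the reciprocity relations constrain.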


As an example, we consider a temperature gradient and a concentration
gradient of a certain chemical component, with chemical potential $\mu$, in
a system, generating a flux of internal energy and of the said chemical
component. One has to take into account here two independent current
densities, one for the internal energy (call it $\mathbf{j}_U$) and the other for the
mole number (call it $\mathbf{j}_N$), and two corresponding affinities (respectively,
$\nabla\left(\frac{1}{T}\right)$ and $-\nabla\left(\frac{\mu}{T}\right)$). However, it is more convenient (and more usual) to
introduce the heat current density $\mathbf{j}_Q$ in the place of $\mathbf{j}_U$, related to
$\mathbf{j}_S$, $\mathbf{j}_U$, $\mathbf{j}_N$ as

$$\mathbf{j}_Q = T\mathbf{j}_S = \mathbf{j}_U - \mu\,\mathbf{j}_N. \tag{8-14}$$


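It turns out that if one chooses $-\mathbf{j}_N$ and $\mathbf{j}_Q$ as the independent current
densities then the corresponding affinities work out to $\frac{1}{T}\nabla\mu$ and $\nabla\left(\frac{1}{T}\right)$
respectively (check this out). The entropy production then appears in the
form

$$\sigma_s = -\frac{1}{T}\nabla\mu \cdot \mathbf{j}_N + \nabla\left(\frac{1}{T}\right) \cdot \mathbf{j}_Q. \tag{8-15}$$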
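The check suggested above can be sketched in a few lines. The sketch assumes that the entropy production in the original variables is the bilinear form built from the affinities $\nabla(1/T)$ and $-\nabla(\mu/T)$ quoted earlier, and it uses $\sigma_s$ as a label for the entropy production (a notational assumption made here). Substituting $\mathbf{j}_U = \mathbf{j}_Q + \mu\,\mathbf{j}_N$ from (8-14) gives

```latex
\begin{align*}
\sigma_s &= \nabla\!\left(\frac{1}{T}\right)\cdot\mathbf{j}_U
            - \nabla\!\left(\frac{\mu}{T}\right)\cdot\mathbf{j}_N \\
         &= \nabla\!\left(\frac{1}{T}\right)\cdot\left(\mathbf{j}_Q + \mu\,\mathbf{j}_N\right)
            - \left[\mu\,\nabla\!\left(\frac{1}{T}\right) + \frac{1}{T}\nabla\mu\right]\cdot\mathbf{j}_N \\
         &= \nabla\!\left(\frac{1}{T}\right)\cdot\mathbf{j}_Q
            + \frac{1}{T}\nabla\mu\cdot\left(-\mathbf{j}_N\right).
\end{align*}
```

The terms proportional to $\mu\,\nabla(1/T)$ cancel, so the currents $-\mathbf{j}_N$ and $\mathbf{j}_Q$ are paired with $\frac{1}{T}\nabla\mu$ and $\nabla\left(\frac{1}{T}\right)$ respectively.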


In the case of a one-dimensional flow of energy and matter (say, along
the x-axis of a Cartesian co-ordinate system), one obtains the following
relations between the current densities and the affinities,

$$\begin{aligned}
-j_N &= L_{11}\,\frac{1}{T}\frac{d\mu}{dx} + L_{12}\,\frac{d}{dx}\left(\frac{1}{T}\right), \\
j_Q &= L_{21}\,\frac{1}{T}\frac{d\mu}{dx} + L_{22}\,\frac{d}{dx}\left(\frac{1}{T}\right),
\end{aligned} \tag{8-16a}$$


where $L_{rs}$ ($r, s = 1, 2$) are the relevant kinetic coefficients that satisfy the
reciprocity relations

$$L_{12} = L_{21} \tag{8-16b}$$


(we assume for simplicity that the kinetic coefficients do not depend on 
any imposed magnetic field). Specializing further to the case of the flow 
of heat and electric current in one dimension, the kinetic coefficients 
are found to be related to the electric conductivity, the thermal conduc- 
tivity, and the thermoelectric coefficients relevant to the system under 
consideration (a thermoelectric set-up involves at least two homogeneous 


conductors). 


In section 8.5 below, we indicate how the basic principles of non-equilibrium 
thermodynamics outlined above can be interpreted in terms of linear re- 
sponse theory of non-equilibrium statistical mechanics. As we will see, 
the concept of entropy production can be interpreted in terms of a set
of affinities and currents in linear response theory that can be defined in
correspondence with the affinities and currents of non-equilibrium
thermodynamics.


Beyond the linear response theory, considerations in non-equilibrium
statistical mechanics can be extended to far-from-equilibrium steady
states, as outlined in chapter 9, sections 9.9, 9.10. There exists 
no theoretical framework for the description of non-equilibrium pro- 
cesses of more general description (i.e., ones other than steady states) 
in the far-from-equilibrium situation though, here again, a number 
of developments of considerable relevance have appeared in the lit- 
erature, of which a few will be outlined in sec. 9.14. In the case 
of far-from-equilibrium processes, the concept of entropy production 
continues to remain as one of fundamental relevance, though the 
concept of entropy itself becomes largely devoid of significance. In- 
deed, far from equilibrium, the distinction between the microscopic 
and macroscopic descriptions loses meaning, and the coarse-graining 
over time- and length scales that results in the distinction between 
the microscopic and macroscopic dynamics within a system can no 
longer be relied upon as a valid means of physical approximation, 


turbulent motions in fluids constituting a case in point. 


For an isolated system released from a far-from-equilibrium config- 
uration, the thermodynamic mode of description becomes valid only 
after the lapse of a sufficiently long time when the system dynam- 
ics can be described in terms of the assumption of local equilibrium. 
In the case of a fluid, this corresponds to the hydrodynamic regime 
(refer to sec. 8.2.3 below). Generally speaking, the role of the micro- 
scopic processes is limited to maintaining the condition of local equi- 
librium in this near-equilibrium regime (it may be noted that these 
very microscopic processes are responsible for thermodynamic entropy


production), where one can gloss over these processes in the thermodynamic
description, while paying the price that there appear kinetic
coefficients in the theory that are phenomenological in nature. The 
determination of these phenomenological constants in microscopic 
terms then becomes the concern of non-equilibrium statistical me- 
chanics where the linear response theory is invoked for the purpose, 


though over a limited range of validity. 


8.2.3 The equations of hydrodynamics 
8.2.3.1 The hydrodynamic regime: introduction 


The primary aim of statistical mechanics is to explain the macroscopic 
behavior of matter in microscopic terms. All the subtlety and mystery of 
statistical mechanics lies in the complementary (and, indeed, contrary)
nature of the microscopic and the macroscopic modes of description.
While the microscopic description involves an enormously large num- 
ber of relevant variables, the macroscopic properties of the system are 
captured in terms of only a small number of thermodynamic variables, 


represented by appropriate phase space functions. 


For the sake of concreteness we consider a one-component fluid — the 
simplest real-life system of statistical mechanics which is already one 
of enormous complexity as soon as one attempts to describe its macro- 
scopic behavior in microscopic terms. If a mass of the fluid is isolated, 
churned vigorously, and then left to itself, it is only after a long time that 
it will go over to an equilibrium configuration. As we have seen, the equilibrium
properties of the fluid can be explained to a reasonable extent in


terms of the equilibrium ensembles of statistical mechanics introduced
in chapter 2. The evolution of the state of the fluid immediately before
the attainment of equilibrium can be described with adequate accuracy 


in terms of the hydrodynamic equations. 


The equilibrium properties are encoded in the equation of state of the 
fluid. The equation of state of a dilute gas is explained quite satisfac- 
torily in terms of the virial expansion (refer to chapter 4), taking the 
ideal gas equation as the point of departure. The equilibrium proper- 
ties of gases at higher densities and liquids are explained in a number 
of effective approximation schemes (refer to section 4.5) where one 
calculates reduced distribution functions of various orders, starting 
from the full equilibrium distribution functions. Phase transitions 
are also understood to a reasonable degree within the framework of 
equilibrium statistical mechanics, where one focuses on magnetic systems
rather than fluids, the latter being understood only in the mean


field approximation. 


Non-equilibrium statistical mechanics attempts to look at the time de- 
pendent behavior of the fluid before the state of equilibrium is attained. 
In reality, the processes taking place within the churned fluid are too 
complex to be described and explained in adequate terms, by making use 
of only a few macroscopic variables. It is only at large times, just before 
the state of equilibrium is attained, that the behavior becomes sufficiently 
simple for an explanation to be reasonably successful. This is the regime 
where the assumption of local equilibrium holds good (the idea of local 


equilibrium is a central one and is to be met with at several places in the
present chapter).


Prior to the temporal regime conforming to the assumption of local equi- 
librium, complex processes occur in the fluid spanning a number of tem- 
poral and spatial scales. Once the local equilibrium is achieved, pro- 
cesses characterized by relatively small spatial and temporal scales con- 
tinue so as to maintain the system in local equilibrium, while now the 
evolution characterized by large temporal and spatial scales can be described
in terms of a relatively small number of local variables. In describing
this macroscopic time evolution one can gloss over the processes
occurring over smaller scales since, in a sense, the only role of the latter 
is the maintenance of the condition of local equilibrium, which results in 


dissipation and entropy production. 


The local variables are the densities of thermodynamic variables, the lat- 
ter being ones corresponding to conserved quantities for the fluid. For an 
isolated mass of fluid these are the energy, the total number of particles, 
and the volume while, for more complex systems, a number of other vari- 
ables may be of relevance in describing its equilibrium state. Thus, for a 
multi-component fluid for which the components do not react chemically 
with one another the number of molecules of each of the components is 
to be included among the set of thermodynamic variables, while further 
variables are also to be introduced for magnetic systems, as also for su- 
perfluids and superconductors, where slowly varying order parameters 


are necessary to describe the state of the system. 


Confining our attention to a one-component fluid for the sake of simplicity
and concreteness, the regime of slow temporal variations over relatively
large spatial scales is the one where the equations of hydrodynamics de- 
scribe the time evolution of the system. These equations are based on 
balance equations for the mass density, momentum density, and energy 
density in the fluid, while the occurrence of dissipative processes makes it 
necessary to include the entropy balance equation as well for a complete 


description of the thermodynamic processes in the system. 


The balance equations relate the local rates of change of the various den- 
sities to the corresponding currents, or fluxes. These are to be supple- 
mented with equations that tell us how the fluxes are related to spatial 
gradients of the local variables that act as driving forces, or affinities (re- 
fer back to sec. 8.2; see also sec. 8.5 below) for the various dissipative 
processes. A number of phenomenological constants appear in these for- 
mulae that carry all the information about the microscopic processes that 
continue to take place, so to say, in the background of the hydrodynamic 


processes. 


In the case of a dilute gas, we will see in sec. 8.3 that the hydrodynamic
regime is the one where the assumption of molecular chaos holds
to within a reasonable degree of accuracy. This assumption is essentially 
equivalent to a contracted description of the system or, in other words, to 
a coarse-graining, where processes of molecular collision occurring on rel- 
atively small spatial and temporal scales are glossed over while focusing 
on the hydrodynamic processes. However, microscopic processes continue 


to occur and are responsible for dissipative effects in the system, involving
entropy production.


The effects of the microscopic processes glossed over in the hydrodynamic
description appear in the macroscopic equations precisely in the form
of the relations between the fluxes and the affinities where a number 
of kinetic coefficients make their appearance. The entropy production 
in the system is completely described in terms of these phenomenological 
coefficients. It is the job of non-equilibrium statistical mechanics to relate 
the kinetic coefficients to features of the microscopic dynamics. Linear 
response theory (sec. 8.4) constitutes a step in the right direction in this 
endeavor by relating the transport coefficients to a set of time correlation 
functions (see sec. 8.4.11.3; see also sec. 8.6 below). Further indications 
in this direction are obtained from the theory of dynamical chaos, as 


briefly outlined in chapter 9 later in this book. 


In concluding this section we note that the densities and fluxes, de- 
fined here as thermodynamic variables, appear as appropriate averages 
when looked at from the point of view of the Boltzmann equation (the 
kinetic theory approach) or of non-equilibrium statistical mechanics. In 
the former case the averages are evaluated with reference to the single- 
particle distribution function (refer to sec. 8.3). In the general case of 
non-equilibrium statistical mechanics, on the other hand, the averages 
are over the relevant phase space distributions where, generally speak- 
ing, the rate of change of a non-equilibrium average gets related to an 


equilibrium time correlation function.

8.2.3.2 The hydrodynamic equations of a simple fluid: a brief outline


The content in the present section is based on [117], which includes all the 


necessary derivations. 


The balance equations are generally of the form 


$$\frac{\partial \xi}{\partial t} + \nabla \cdot \mathbf{J}_\xi = \sigma_\xi, \tag{8-17}$$

where $\xi$ stands for a relevant density and $\mathbf{J}_\xi$ for the corresponding flux.
The term $\sigma_\xi$ on the right hand side represents a source density, added
for the sake of generality. It is of particular relevance in the case of the 
entropy balance equation where, even for an isolated system, there occurs 


entropy production due to internal dissipative processes. 


We consider a mass of fluid made up of point particles with isotropic 
interaction between them. The simplest among the balance equations is
the mass balance equation:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0, \tag{8-18}$$

(refer to and compare with eq. (8-63) below, in the context of the Boltzmann
equation for a dilute gas), where $\rho$ stands for the mass density and $\mathbf{v}$
for the velocity, in the center of mass frame of the fluid, of a small volume
element (of macroscopic dimensions) located at position $\mathbf{r}$ at time $t$.
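The content of a source-free balance equation such as (8-18) can be made concrete with a small numerical sketch: when the density is updated only through fluxes across cell faces, whatever leaves one cell enters its neighbor, so the total mass is unchanged. The Python fragment below assumes a one-dimensional periodic domain, a uniform velocity `v > 0`, and a first-order upwind flux. All of these are simplifying choices made here for illustration, not part of the theory outlined in the text.

```python
import math


def step_mass_density(rho, v, dx, dt):
    """One explicit step of d(rho)/dt + d(rho*v)/dx = 0 on a periodic grid.

    Uses a first-order upwind flux, valid for uniform v > 0.
    """
    n = len(rho)
    # Flux through the right face of cell i, taken from the upwind cell i.
    flux = [rho[i] * v for i in range(n)]
    # Each cell loses its right-face flux and gains its left-face flux;
    # flux[i - 1] with i = 0 wraps around, which gives periodic boundaries.
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


n = 50
dx = 1.0 / n
v = 1.0
dt = 0.5 * dx / v                     # Courant number 0.5 (stable for upwind)
rho = [1.0 + 0.5 * math.sin(2.0 * math.pi * (i + 0.5) * dx) for i in range(n)]

mass_initial = sum(rho) * dx
for _ in range(200):
    rho = step_mass_density(rho, v, dx, dt)
mass_final = sum(rho) * dx
print(mass_initial, mass_final)
```

The density profile is advected and smeared by the scheme, but the total mass changes only at the level of floating-point rounding.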


The momentum balance equation looks more complicated since here the
flux, being the current density of a vector quantity (the momentum density),
is a tensor of rank two. On setting up the equation of continuity
of the momentum density, one obtains, in the absence of an external
volume force acting on the fluid,

$$\frac{\partial}{\partial t}(\rho \mathbf{v}) + \nabla \cdot \left(\rho \mathbf{v}\mathbf{v} + \overleftrightarrow{P}\right) = 0, \tag{8-19a}$$

(compare with eq. (8-64a)). Here $\rho\mathbf{v}\mathbf{v}$ (a second rank tensor with Cartesian
components $\rho v_i v_j$ ($i, j = 1, 2, 3$)) represents the convective flux of
momentum while $\overleftrightarrow{P}$ stands for the pressure tensor whose Cartesian
components are of the form

$$P_{ij} = p\,\delta_{ij} + \Pi_{ij}. \tag{8-19b}$$

In this expression $p$ stands for the hydrostatic pressure (a scalar) while
$\overleftrightarrow{\Pi}$ represents the stress tensor where the latter assumes relevance in the
case of a viscous fluid. In the presence of a volume force acting on the
fluid, one has to add a source term on the right hand side of (8-19a).


We consider next the energy balance equation in a fluid, which appears
in the form

$$\frac{\partial}{\partial t}(\rho e) + \nabla \cdot \left(\mathbf{j}_e^{[R]} + \mathbf{j}_e^{[D]}\right) = 0, \tag{8-20a}$$

where $e$ stands for the energy per unit mass, and $\mathbf{j}_e^{[R]}$, $\mathbf{j}_e^{[D]}$ for the reactive
and dissipative components of the energy flux. The former of the two
fluxes represents the total energy flux in an ideal fluid where no
dissipation due to thermal conductivity or viscosity takes place, and is given
by

$$\mathbf{j}_e^{[R]} = \rho \mathbf{v}\left(u + \frac{p}{\rho} + \frac{1}{2}v^2\right). \tag{8-20b}$$

Here $u$ represents the internal energy per unit mass in the fluid, in terms
of which the energy per unit mass $e$ is given by

$$e = u + \frac{1}{2}v^2. \tag{8-20c}$$


The dissipative part of the energy flux (also referred to as the heat flux)
enters into the entropy balance equation, which can be expressed in the
form

$$\frac{\partial}{\partial t}(\rho s) + \nabla \cdot \left(\rho s \mathbf{v} + \mathbf{j}_s^{[D]}\right) = \sigma_s. \tag{8-21a}$$

In this formula $s$ stands for the entropy per unit mass given by the local
equilibrium condition expressing the second law of thermodynamics,

$$\rho s = \frac{1}{T}\left(\rho u + p - \mu\rho\right), \tag{8-21b}$$

where $T$ stands for the local temperature and $\mu$ for the chemical potential
per unit mass. In the balance equation (8-21a), $\mathbf{j}_s^{[D]}$ and $\sigma_s$ represent
the dissipative part of the entropy flux and the entropy density production
(commonly referred to as the entropy production), the latter being
the source term for entropy arising from the dissipative processes taking
place in the fluid. Note that, in eq. (8-21a), $\rho s\mathbf{v}$ represents the reactive, or
non-dissipative, part of the entropy flux. The dissipative entropy flux, on
the other hand, is given, in terms of the heat flux, by

$$\mathbf{j}_s^{[D]} = \frac{1}{T}\left(\mathbf{j}_e^{[D]} - \overleftrightarrow{\Pi} \cdot \mathbf{v}\right), \tag{8-21c}$$

where $\overleftrightarrow{\Pi} \cdot \mathbf{v}$ represents the vector with Cartesian components
$\sum_j \Pi_{ij} v_j$ ($i = 1, 2, 3$).


For a fluid made up of point particles with isotropic interactions be- 
tween them, the angular momentum is conserved both locally and 
globally, which is why a separate balance equation for the angular 


momentum is not necessary. 


Finally, the entropy production $\sigma_s$ in (8-21a) is given by

$$\sigma_s = -\frac{1}{T}\,\mathbf{j}_s^{[D]} \cdot \nabla T - \frac{1}{T}\sum_{i,j} \Pi_{ij}\,\partial_i v_j, \tag{8-21d}$$

where the double sum in the second term on the right hand side represents
the inner product of the tensors $\overleftrightarrow{\Pi}$ and $\nabla\mathbf{v}$. It may be noted that
the entropy production is of the general form (8-10b) with an appropriate
identification of the relevant thermodynamic fluxes and affinities (check
this out).


In order to have a closed set of equations describing the evolution of the 
macroscopic state of the fluid, the balance equations are to be supple- 
mented with a set of constitutive equations relating the thermodynamic 


fluxes and affinities, where a set of transport coefficients make their
appearance. The transport coefficients are the phenomenological expressions
of the dissipative effects within the fluid.


In the case of an isotropic fluid made up of point particles, for which 
the interaction potential between the particles is spherically symmetric, 
a vortex motion in the fluid cannot be created by means of microscopic 
processes, and hence the antisymmetric part of the pressure tensor $\overleftrightarrow{\Pi}$ is
identically zero for such a fluid. One can then specify the relation between 
affinities and fluxes in terms of three kinetic coefficients (recall that the 
kinetic coefficients appear as constants of proportionality between ther- 
modynamic fluxes and affinities) — two corresponding to scalar affinities 
and one corresponding to an affinity of a symmetric tensor type. However, 
the transport coefficients commonly used to characterize the relation be- 
tween the fluxes and forces introduced above — the latter appearing in 
the form of spatial gradients of intensive thermodynamic variables — dif- 
fer from the kinetic coefficients in that the ‘forces’ differ from the affinities 
by certain multiplicative factors (for instance, the affinity corresponding 
to a temperature gradient is $\nabla\left(\frac{1}{T}\right)$ rather than $-\nabla T$). The three transport
coefficients are the thermal conductivity $K$, the shear viscosity $\eta$, and the
bulk viscosity $\zeta$, defined by means of the following relations

$$\mathbf{j}_s^{[D]} = -\frac{K}{T}\nabla T, \quad \left(\overleftrightarrow{\Pi}\right)^{[s]} = -2\eta\,(\nabla\mathbf{v})^{[s]}, \quad \frac{1}{3}\sum_i \Pi_{ii} = -\zeta\,\nabla \cdot \mathbf{v}, \tag{8-22}$$

where the super-index ‘[s]’ is used to denote the symmetric part of a
second rank tensor (thus $\left(\overleftrightarrow{\Pi}\right)^{[s]}_{ij} = \frac{1}{2}(\Pi_{ij} + \Pi_{ji}) - \frac{1}{3}\delta_{ij}\sum_k \Pi_{kk}$ ($i, j = 1, 2, 3$)). On
substituting in the expression (8-21d), one finds that positive definiteness
of the entropy production $\sigma_s$ (as required by the second law of
thermodynamics) is ensured if all the three transport coefficients ($K, \eta, \zeta$) are
positive, which is precisely what is found to be the case in all experimental
observations. This is consistent with the statement that the transport
coefficients bear the imprint of the microscopic dissipative processes in
the hydrodynamic regime characterized by large scale variations of the
macroscopic variables $\rho$, $\rho\mathbf{v}$, $u$, $s$.
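The positivity statement can be checked numerically. The Python sketch below evaluates $\sigma_s$ from (8-21d) after inserting the constitutive relations (8-22), for randomly chosen temperature and velocity gradients, and compares it with the sum of squares obtained by carrying out the substitution by hand. The stress tensor is assembled as $\Pi_{ij} = -2\eta\,(\nabla\mathbf{v})^{[s]}_{ij} - \zeta\,\delta_{ij}\nabla\cdot\mathbf{v}$, which assumes a vanishing antisymmetric part. The function names and the numerical values of $T$, $K$, $\eta$, $\zeta$ are hypothetical and serve only the illustration.

```python
import random


def sym_traceless(G):
    """Symmetric traceless part: (G_ij + G_ji)/2 - delta_ij * trace(G)/3."""
    tr = sum(G[i][i] for i in range(3))
    return [[0.5 * (G[i][j] + G[j][i]) - (tr / 3.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def entropy_production(grad_T, grad_v, T, K, eta, zeta):
    """sigma_s of (8-21d) with (8-22) inserted; grad_v[i][j] = d v_j / d x_i."""
    div_v = sum(grad_v[i][i] for i in range(3))
    S = sym_traceless(grad_v)
    j_s = [-K / T * g for g in grad_T]            # dissipative entropy flux
    Pi = [[-2.0 * eta * S[i][j] - (zeta * div_v if i == j else 0.0)
           for j in range(3)] for i in range(3)]  # viscous stress tensor
    heat = -sum(j_s[i] * grad_T[i] for i in range(3)) / T
    visc = -sum(Pi[i][j] * grad_v[i][j]
                for i in range(3) for j in range(3)) / T
    return heat + visc


def sum_of_squares(grad_T, grad_v, T, K, eta, zeta):
    """Closed form expected after the substitution, manifestly non-negative."""
    div_v = sum(grad_v[i][i] for i in range(3))
    S = sym_traceless(grad_v)
    return (K / T**2 * sum(g * g for g in grad_T)
            + 2.0 * eta / T * sum(S[i][j] ** 2
                                  for i in range(3) for j in range(3))
            + zeta / T * div_v**2)


random.seed(0)
T, K, eta, zeta = 300.0, 0.6, 1.0e-3, 2.5e-3      # hypothetical values
samples = []
for _ in range(100):
    grad_T = [random.uniform(-1.0, 1.0) for _ in range(3)]
    grad_v = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    samples.append((entropy_production(grad_T, grad_v, T, K, eta, zeta),
                    sum_of_squares(grad_T, grad_v, T, K, eta, zeta)))
```

Each of the three contributions is a square multiplied by one transport coefficient, which is why positive $K$, $\eta$, $\zeta$ suffice for a non-negative entropy production.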


The fact that the transport coefficients and the kinetic coefficients are all
positive is, in the ultimate analysis, a consequence of the second law
of thermodynamics which, as we have seen, is accommodated into the 
equilibrium ensembles of statistical mechanics in the form of the relevant 
variational principles satisfied by the respective probability distributions 
(e.g., the maximum entropy principle in the microcanonical ensemble — 


sec. 2.1.2). 


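Making use of the constitutive relations, one obtains the mass balance,
momentum balance, and entropy balance equations in the following form

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0, \tag{8-23a}$$

$$\frac{\partial}{\partial t}(\rho \mathbf{v}) + \nabla \cdot (\rho \mathbf{v}\mathbf{v}) = -\nabla p + \eta \nabla^2 \mathbf{v} + \left(\zeta + \frac{1}{3}\eta\right)\nabla(\nabla \cdot \mathbf{v}), \tag{8-23b}$$

$$\frac{\partial}{\partial t}(\rho s) + \nabla \cdot \left(\rho s \mathbf{v} + \mathbf{j}_s^{[D]}\right) = \frac{K}{T^2}(\nabla T)^2 + \frac{2\eta}{T}\left((\nabla\mathbf{v})^{[s]}\right)^2 + \frac{\zeta}{T}(\nabla \cdot \mathbf{v})^2, \tag{8-23c}$$

(check out formulae (8-23b), (8-23c)). Note that the dissipative entropy
current $\mathbf{j}_s^{[D]}$ occurring on the left hand side of (8-23c) can be replaced with
$-\frac{K}{T}\nabla T$.

Eq. (8-23b) is referred to as the Navier-Stokes equation of hydrodynamics.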
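The check of (8-23b) suggested above can be sketched as follows. The sketch assumes a vanishing antisymmetric part of the stress tensor and spatially uniform $\eta$, $\zeta$; the shorthand $\partial_i = \partial/\partial x_i$ is introduced here only for this purpose. Writing out the stress tensor from the constitutive relations and taking its divergence, one finds

```latex
\begin{align*}
\Pi_{ij} &= -\eta\left(\partial_i v_j + \partial_j v_i
            - \tfrac{2}{3}\,\delta_{ij}\,\nabla\cdot\mathbf{v}\right)
            - \zeta\,\delta_{ij}\,\nabla\cdot\mathbf{v}, \\
-\sum_i \partial_i \Pi_{ij}
         &= \eta\,\nabla^2 v_j + \eta\,\partial_j(\nabla\cdot\mathbf{v})
            - \tfrac{2}{3}\,\eta\,\partial_j(\nabla\cdot\mathbf{v})
            + \zeta\,\partial_j(\nabla\cdot\mathbf{v}) \\
         &= \eta\,\nabla^2 v_j
            + \left(\zeta + \tfrac{1}{3}\,\eta\right)\partial_j(\nabla\cdot\mathbf{v}).
\end{align*}
```

Together with the contribution $-\partial_j p$ from the isotropic part of the pressure tensor, this reproduces the right hand side of the momentum equation.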


The equations (8-23a) - (8-23c) are to be complemented with the energy
balance equation (8-20a), with the non-dissipative energy flux $\mathbf{j}_e^{[R]}$ given
by (8-20b). On making use of the expression (8-20c) for the energy density
as also of the local form of the second law, along with the local form of
the Gibbs-Duhem formula ($d\mu + s\,dT - \frac{dp}{\rho} = 0$), one can express
$\frac{\partial}{\partial t}(\rho e)$ in the form


utv: (5,P¥) — vu. (8-234d) 


Among the non-dissipative and dissipative energy currents $\mathbf{j}_e^{[R]}$, $\mathbf{j}_e^{[D]}$, while
the former is given by (8-20b), the latter can be expressed in terms of the
transport coefficients by making use of (8-21c), and the three formulae
in (8-22). Finally, the internal energy density $\rho u$ can be expressed in terms
of the temperature in the form

$$\rho u = \rho c_V T, \tag{8-23e}$$

where $c_V$, the specific heat (per unit mass) at constant volume, has been
assumed to be a constant.
With all this (equations (8-23a) - (8-23c), and (8-20a) with the substitutions
mentioned above) one has a closed set of equations of hydrodynamics
of a simple fluid involving the three transport coefficients $K$, $\eta$, $\zeta$.
Indeed, one can, along with the mass and the momentum equations, 
choose any one of the energy and the entropy equations, and make use 
of known relations of equilibrium thermodynamics (since such relations 


hold between local densities) for a complete description. 


8.2.3.3 Linearization: propagating and diffusive modes 


While the equations of hydrodynamics for a simple fluid outlined in sec. 8.2.3.2 
above constitute a closed set, these are, at the same time, of a complex 
nature since they form a set of coupled non-linear partial differential 


equations. 


One can linearize the hydrodynamic equations around the equilibrium
solution $\rho = \rho_0$, $T = T_0$, $\mathbf{v} = 0$, where these are homogeneous stationary
values characterizing the equilibrium state of the fluid under given
conditions, for which $\mathbf{v} = 0$ in the center of mass frame (the other
thermodynamic parameters being determined in terms of the known
thermodynamic formulae). The linearized equations are conveniently analyzed by
looking at Fourier modes corresponding to various possible wavelengths
($\lambda$) and associated wave vectors ($\mathbf{k}$). The wave vectors form a discrete set
for a system of (large) finite volume ($V$) under periodic boundary
conditions, which goes over to a quasi-continuous set for $V \to \infty$. For any given
$\mathbf{k}$, the solution for a local thermodynamic variable typically appears in
the form $A\,e^{-st + i\mathbf{k}\cdot\mathbf{r}}$ where $A$ is an amplitude while the temporal variation
of the ‘mode’ under consideration depends on the parameter $s$, which
can be either real and positive or complex with a positive real part, as
required by the second law of thermodynamics according to which the
transport coefficients $K$, $\eta$, $\zeta$ are to be all positive (recall that this is
necessary to make the entropy production (8-21d) positive definite). In the
former case the mode under consideration is said to be a diffusive one
while the latter corresponds to a (damped) propagating mode.


Let us denote the relevant variables in the five independent linearized
hydrodynamic equations (three for momentum density components, one
for mass density, and one for energy or entropy density) by $a_j(\mathbf{r}, t)$
($j = 1, 2, \ldots, 5$), these being the deviations from the equilibrium values
mentioned above (one can conveniently choose the density, temperature, and
the three velocity components as the relevant independent variables). For
any chosen value of the wave vector $\mathbf{k}$, one assumes that these are of the
form $A_j e^{-st + i\mathbf{k}\cdot\mathbf{r}}$ ($j = 1, 2, \ldots, 5$) where at least some of the $A_j$’s are
non-zero. On substitution in the linearized equations, the condition for
existence of a non-trivial solution for the $A_j$’s is obtained in the form of a $5 \times 5$
determinant being equal to zero. This then gives, in general, five solutions
for $s$, specifying the temporal behavior of the five possible hydrodynamic
modes of a fluid for the given wave vector $\mathbf{k}$.


Among the five possible hydrodynamic modes [117], two are transverse
ones corresponding to damped non-propagating shear modes, for which
$s_k$ as a function of the wave number ($k = \frac{2\pi}{\lambda}$) is given by

$$s_k = \frac{\eta}{\rho_0}k^2. \tag{8-24a}$$

These transverse modes describe the variations of the transverse
components of velocity ($\mathbf{v}^{(t)} = \mathbf{v} - \frac{\mathbf{k}}{k^2}(\mathbf{v}\cdot\mathbf{k})$) and are decoupled from the remaining
three modes, all longitudinal.


The three longitudinal modes describe the variations of $\rho$, $T$ and
$\mathbf{v}^{(l)} = \frac{\mathbf{k}}{k^2}(\mathbf{v}\cdot\mathbf{k})$ which are coupled together. One of these corresponds
predominantly (i.e., in an approximate description) to the conduction of heat, and
is diffusive (i.e., damped and non-propagating) in nature, with $s_k$ given
by

$$s_k = \frac{K}{\rho_0 c_p}k^2, \tag{8-24b}$$

where $c_p$ stands for the specific heat (per unit mass) at constant pressure
whose variation with temperature is ignored under the approximation
involved.


The remaining two longitudinal modes describe the damped propagation
of sound waves with $s_k$ given by

$$s_k = \Gamma k^2 \pm i c k. \tag{8-24c}$$

Here $c = \sqrt{\frac{c_p}{c_V}\left(\frac{\partial p}{\partial \rho}\right)_T}$ stands for the velocity of sound in the fluid,
and $\Gamma = \frac{1}{2}\left[\frac{1}{\rho_0}\left(\zeta + \frac{4}{3}\eta\right) + \frac{K}{\rho_0 c_p}\left(\frac{c_p}{c_V} - 1\right)\right]$ represents the damping rate. The
damping occurs because of the involvement of heat flow and viscous flow
in sound propagation. In a non-dissipative medium ($K$, $\eta$, $\zeta$ all zero) the
sound waves (for which $\mathbf{v}$ is parallel or anti-parallel to $\mathbf{k}$) propagate with
velocity $c$ without being damped.
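The small-$k$ formulae for the longitudinal modes can be compared with a direct solution of the dispersion relation in a simple special case. The Python sketch below assumes an ideal gas, $p = \rho R T$ with $R$ the gas constant per unit mass, for which $c^2 = \frac{c_p}{c_V} R T_0$. For the variables $(\delta\rho, v^{(l)}, \delta T) \propto e^{-st + ikx}$, the linearized equations then lead to a cubic equation in $s$; this cubic was worked out for the sketch and is not quoted from [117]. Newton iteration started from the approximate expressions (8-24b) and (8-24c) refines each root. All parameter values are hypothetical and in arbitrary units.

```python
import math


def characteristic_poly(s, k, T0, R, c_v, nu_l, chi_v):
    """Value and derivative of the cubic s^3 + a2 s^2 + a1 s + a0.

    nu_l  = (zeta + 4*eta/3) / rho0   (longitudinal kinematic viscosity)
    chi_v = K / (rho0 * c_v)
    """
    gamma = 1.0 + R / c_v                       # c_p / c_V for an ideal gas
    a2 = -(nu_l + chi_v) * k**2
    a1 = nu_l * chi_v * k**4 + gamma * R * T0 * k**2
    a0 = -chi_v * R * T0 * k**4
    value = ((s + a2) * s + a1) * s + a0
    slope = (3.0 * s + 2.0 * a2) * s + a1
    return value, slope


def refine_root(s, k, T0, R, c_v, nu_l, chi_v, steps=50):
    """Newton iteration on the cubic; works for real or complex starting s."""
    for _ in range(steps):
        value, slope = characteristic_poly(s, k, T0, R, c_v, nu_l, chi_v)
        s = s - value / slope
    return s


# Hypothetical parameters (arbitrary units).
rho0, T0, R, c_v = 1.0, 1.0, 1.0, 1.5
K, eta, zeta, k = 0.1, 0.05, 0.02, 0.01

c_p = c_v + R
nu_l = (zeta + 4.0 * eta / 3.0) / rho0
chi_v = K / (rho0 * c_v)
D_T = K / (rho0 * c_p)                          # coefficient in (8-24b)
c = math.sqrt(c_p / c_v * R * T0)               # sound velocity
Gamma = 0.5 * (nu_l + D_T * (c_p / c_v - 1.0))  # damping coefficient

s_heat_approx = D_T * k**2
s_sound_approx = complex(Gamma * k**2, c * k)

s_heat = refine_root(s_heat_approx, k, T0, R, c_v, nu_l, chi_v)
s_sound = refine_root(s_sound_approx, k, T0, R, c_v, nu_l, chi_v)
print(s_heat, s_sound)
```

For small $k$ the refined roots stay close to the starting values: one real positive root (the diffusive heat mode) and a complex-conjugate pair with positive real part (damped sound), of which only the root with positive imaginary part is computed here.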
The linearized equations of hydrodynamics are of great significance in
that these explicitly tell us that, close to equilibrium, the various modes
of the relevant local thermodynamic variables approach their respective
equilibrium values in an exponentially damped manner, and that the
modes with relatively small wavelengths (large $k$) are damped rapidly
while those with large wavelengths ($k \to 0$) are the ones that dominate
the asymptotic ($t \to \infty$) approach to equilibrium, with the damping rate
given by (refer to (8-24a), (8-24b))

$$s_k \sim Dk^2, \tag{8-24d}$$

where $D$ stands for some relevant diffusion constant. In the thermodynamic
limit, the wave numbers $k$ accumulate at $k = 0$, and the superposition
of a continuously distributed set of modes results in a different
time-dependence ($\sim t^{-\gamma}$ ($\gamma > 0$)) of approach to equilibrium.
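How a power law can emerge from exponentially decaying modes may be seen from a rough sketch. It rests on assumptions not made in the text: the modes live in $d$ spatial dimensions, each decays as $e^{-Dk^2 t}$, and all are superposed with equal weight near $k = 0$. The superposition is then a Gaussian integral,

```latex
\begin{equation*}
\int \frac{d^d k}{(2\pi)^d}\; e^{-D k^2 t}
  = \left(\frac{1}{4\pi D t}\right)^{d/2} \propto t^{-d/2},
\end{equation*}
```

so that, under these assumptions, the exponent is $\gamma = d/2$. A different weighting of the modes near $k = 0$ would change the exponent but not the algebraic character of the decay.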


While the thermodynamic variables of a simple fluid corresponding to 
the globally conserved quantities have been considered above, more com- 
plex systems such as a non-reacting multi-component fluid are described 
in terms of a larger number of conserved quantities (the respective mo- 
lar fractions) where the non-equilibrium time evolution involves a larger 
number of transport coefficients (including the mutual diffusion con- 
stants), but the transport equations (refer to [117], chapter 10) are of 


the same general nature as above. 


Additional features appear in systems characterized by spontaneously 


broken symmetries (refer back to sections 6.2.5.5, 6.3.3, 6.5.2, among
others) where the relevant order parameters are to be included as the
slowly varying thermodynamic quantities. Once again, the general fea- 
tures of approach to equilibrium are found to be analogous to those in a 


simple fluid. 


The equations of non-equilibrium thermodynamics, based on phenomeno- 
logical considerations, are corroborated in approaches based on kinetic 
theory (see sec. 8.3 below) and linear response theory (section 8.5) where, 
additionally, one obtains the transport coefficients in terms of features of 
the microscopic dynamics of the systems. In the linear response the- 
ory, the results appear in the form of the Kubo formulae as outlined in 


sec. 8.4.11. 


8.3 Kinetic theory: the Boltzmann equation 


8.3.1 The Boltzmann equation: introduction 


The Boltzmann equation was developed to obtain the macroscopic prop- 
erties of dilute gases from the interactions of their molecules in motion. 
It constitutes the crowning achievement of the kinetic theory of gases, 
where the dynamical aspects of the molecular motions are integrated with 
statistical considerations in arriving at equilibrium and non-equilibrium 
properties of a gas, the latter emerging as consequences of the Boltzmann 
equation. In the present section I give the derivation of the Boltzmann 
equation in the barest outline (see [29], [101], [67], [81], [16], [17] for de- 
tailed derivations and for various applications of the Boltzmann equation; 


David Tong’s book [137], available on the internet, is an excellent introduction
to a wide range of ideas in non-equilibrium statistical mechanics,
with a clear and readable exposition on the Boltzmann equation). While 
the Boltzmann equation applies to dilute gases, the approach of kinetic 
theory extends to more dense systems as well, though with much less 


success. 


In the kinetic theory, one defines a set of distribution functions of which 
the one particle distribution function is of basic relevance for a dilute 
gas. More generally, one can define n-particle distribution functions 
(n = 1, 2, ...), these being relevant for systems with higher
concentrations of constituent particles. For a system in equilibrium, the n-particle
distribution functions can be related to the equilibrium ensembles of sta- 
tistical mechanics, as in sec. 4.5.5. However, the n-particle distribution 
function can more generally be defined, in principle, for a system out of 
equilibrium as well, and a hierarchy of equations describing the evolution 
of these distribution functions can then be set up as in the equilibrium 
case. The question then comes up as to how one can truncate the hierar- 
chy so as to obtain a closed set of evolution equations, which is where the 
kinetic theory approach meets with obstacles when concrete problems are 
addressed. In the case of a dilute gas, the assumption of molecular chaos
introduced by Boltzmann leads to a closed evolution equation involving
only the one-particle distribution function, which is precisely the subject
under consideration in the present section.


Statistical mechanics, on the other hand, seeks to work with distribution
functions for the entire system at hand, and constitutes a complementary
discipline. The setting up of a non-equilibrium distribution function for an
entire system is, however, a tall order in general. One can still make good 
use of an equilibrium distribution function and obtain meaningful results 
in the linear non-equilibrium regime. This will be our subject of inquiry in 
sec. 8.4 below. More generally, one can extend the method of statistical 
ensembles to accommodate non-equilibrium steady states not necessarily 
close to equilibrium ones, as we will see subsequently (sections 9.10, 9.12). 
A number of results of fundamental relevance have also been obtained in
respect of transient processes of a more general type beyond the limits of
the linear regime (sec. 9.14).


Though the Boltzmann equation constitutes a closed evolution equation 
for the one-particle distribution function, it poses almost insurmountable
barriers to any attempt at a general solution under arbitrarily chosen
initial and boundary conditions. Approximation methods of limited
scope have been developed, producing results for the transport properties of
dilute gases, which we will outline below.


After outlining the derivation of the Boltzmann equation in sec. 8.3.2 below,
I will introduce, in sec. 8.3.3, the Liouville equation, which constitutes the
cornerstone of kinetic theory and non-equilibrium statistical mechanics.
This will be followed by the formulation of the BBGKY hierarchy in
sec. 8.3.4, within which the Boltzmann equation appears under a special
assumption, namely, that of molecular chaos (see below).






8.3.2 The Boltzmann equation: derivation in outline 


We consider a dilute gas made up of a large number (N) of molecules,
to be referred to as ‘particles’, contained within a large volume (V). The
particles are assumed to interact through a strongly repulsive short-range
force, the range ($\sigma$) of the force being assumed to be small compared to
the mean separation between the particles. In other words, we assume

$$N\sigma^{3} \ll V. \tag{8-25}$$

Though this gives a specific meaning to the term ‘dilute’, one has to
assume simultaneously that the particles undergo collisions with one
another, the mean number of collisions suffered by a particle in unit time
being finite. The latter number is of the order of $\frac{N}{V}\sigma^{2}\bar{c}$ (reason out why),
where $\bar{c}$ stands for the mean speed of a particle. More specifically, the
condition of validity of the Boltzmann equation reads

$$\frac{N}{V} \to \infty, \qquad \sigma \to 0, \qquad \frac{N}{V}\sigma^{2}\bar{c} \to \nu, \tag{8-26}$$

where $\nu$ ($\neq 0$) denotes the mean number of collisions suffered by a particle
per unit time. These are said to define the Grad limit, which needs to be
approximated closely by a gas so as to qualify for a description in terms of
the Boltzmann equation (this is commonly referred to as the
Boltzmann-Grad approximation).
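To get a feel for the orders of magnitude involved in (8-25) and (8-26), the following minimal sketch evaluates the diluteness parameter and the collision-rate estimate for a nitrogen-like gas. All numerical values (temperature, pressure, molecular mass, and the range sigma) are illustrative assumptions, not data taken from the text, and the rate is only the order-of-magnitude estimate quoted above.

```python
import math

# Illustrative, assumed values for a nitrogen-like gas near room conditions.
k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # temperature, K (assumed)
P = 101325.0         # pressure, Pa (assumed)
m = 4.65e-26         # molecular mass, kg (roughly that of N2, assumed)
sigma = 3.7e-10      # range of the repulsive force, m (assumed)

number_density = P / (k_B * T)                          # N/V from the ideal gas law
mean_speed = math.sqrt(8.0 * k_B * T / (math.pi * m))   # Maxwell mean speed

diluteness = number_density * sigma**3                  # N sigma^3 / V, cf. (8-25)
collision_rate = number_density * sigma**2 * mean_speed # order of magnitude of nu

print(f"N sigma^3 / V      ~ {diluteness:.2e}")
print(f"mean speed         ~ {mean_speed:.0f} m/s")
print(f"collisions per sec ~ {collision_rate:.1e}")
```

With these assumed inputs the diluteness parameter is small compared to unity while the collision rate remains finite, which is the regime the Boltzmann equation is meant to describe.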


The one-particle distribution function f(r, p; t) is defined (in the same
spirit as in sec. 4.5.2) by referring to the six-dimensional $\mu$-space of an
individual particle in the gas, made up of its position and momentum
vectors, by requiring that the number of particles in a small volume
element $\delta^{(3)}r\,\delta^{(3)}p$ around the point (r, p) in this space at any time instant t
be $f(\mathbf{r},\mathbf{p};t)\,\delta^{(3)}r\,\delta^{(3)}p$. For a sufficiently large value of N, one can assume
that the distribution function so defined is a well-behaved one (strictly
speaking, one needs to define f in terms of a time average over a short
time interval, see [17]).


We now work out the change in the one-particle distribution function at
(r, p) during a short time interval $\delta t$. This change, $\delta f = f(\mathbf{r},\mathbf{p};t+\delta t) - f(\mathbf{r},\mathbf{p};t)$,
can be expressed as the sum of three parts arising principally
due to three distinct processes: (A) the free flight of particles in between 
successive collisions, (B) the decrease in the number of particles within 
the chosen volume element by means of collisions, and (C) the reverse 
process of increase in the number of particles within the chosen volume 
element as a result of collisions. Here the term ‘collision’ is used to mean 
an interaction or scattering involving two particles, where the interaction 
takes place when the separation between the particles is small enough so 
that they are within the range of repulsion of each other. A limiting case 
of such short-range repulsive interaction is the contact repulsion in the 
elastic collision between two hard spheres. The ‘hard sphere’ model is 
a commonly invoked simplification in the literature relating to the
Boltzmann equation. In between the collisions, the particles move with
constant velocities in free motion. Incidentally, a fundamental assumption
in the derivation of the Boltzmann equation, associated with the
conditions (8-26), is that all collisions are of the two-particle type, i.e., three
or more particles do not get involved in a scattering event. We now
consider each of the above three types of processes contributing to the
evolution of the one-particle distribution function.


(A) The process of free flight of particles in between collisions is
responsible for a contribution to $\delta f$ that we refer to as $\delta f_{\rm free}$. We assume for the
sake of simplicity that there is no external field acting on the particles.
The modification in the presence of an external field will be included next.


Corresponding to the chosen volume element $\delta^{(3)}r\,\delta^{(3)}p$ around the point
(r, p), we consider another element of equal volume around the point
$(\mathbf{r}-\frac{\mathbf{p}}{m}\delta t,\ \mathbf{p})$ (m = mass of each particle, where we assume for the sake
of simplicity that all the particles are identical), the two volume elements
being marked, respectively, A and B in fig. 8-1 below. Noting that volume
elements in the phase space of a particle remain unaltered in magnitude
during its motion (a fundamental result in Hamiltonian mechanics),
particles in B move over to the volume element A as a result of free flight
(reason this out). In other words,


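$$\begin{aligned}
\delta f_{\rm free}(\mathbf{r},\mathbf{p};t) &= f(\mathbf{r},\mathbf{p};t+\delta t) - f(\mathbf{r},\mathbf{p};t)\\
&= f\Big(\mathbf{r}-\frac{\mathbf{p}}{m}\delta t,\ \mathbf{p};t\Big) - f(\mathbf{r},\mathbf{p};t) = -\frac{1}{m}\,\mathbf{p}\cdot\nabla_{\mathbf{r}} f(\mathbf{r},\mathbf{p};t)\,\delta t,
\end{aligned} \tag{8-27a}$$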


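(check this out). In the more general case where an external field acts on
the particles making up the system, one obtains an additional term on
the right hand side since now the element B mentioned above is required
to be located at $(\mathbf{r}-\frac{\mathbf{p}}{m}\delta t,\ \mathbf{p}+\frac{\partial V}{\partial\mathbf{r}}\delta t)$, where V(r) stands for the potential
energy due to the external field of a particle located at r, the force on the
particle being $-\frac{\partial V}{\partial\mathbf{r}}$. One then obtains

$$\begin{aligned}
\delta f_{\rm free}(\mathbf{r},\mathbf{p};t) &= \Big[-\frac{1}{m}\,\mathbf{p}\cdot\frac{\partial f(\mathbf{r},\mathbf{p};t)}{\partial\mathbf{r}} + \frac{\partial V}{\partial\mathbf{r}}\cdot\frac{\partial f(\mathbf{r},\mathbf{p};t)}{\partial\mathbf{p}}\Big]\delta t\\
&= \{H, f\}\,\delta t,
\end{aligned} \tag{8-27b}$$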


where the last line involves the Poisson bracket of the Hamiltonian of a
single particle in the external field,

$$H = \frac{p^{2}}{2m} + V(\mathbf{r}), \tag{8-27c}$$


with the single particle distribution function f. The label ‘free’ is com- 
monly used even in the presence of the external field, to signify that the 
formula (8-27b) does not include the effect of the mutual interaction
between the particles.
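The streaming formula (8-27b) can be checked numerically in one space dimension. The sketch below is only an illustration under stated assumptions: the smooth trial distribution `f` and the harmonic external potential with stiffness `k` are hypothetical choices, not taken from the text. It compares the difference between the values of f at the elements B and A with the Poisson bracket expression.

```python
import math

m = 1.0   # particle mass (arbitrary units, assumed)
k = 0.5   # stiffness of an assumed external potential V(r) = k r^2 / 2

def f(r, p):
    """Hypothetical smooth one-particle distribution (not normalized)."""
    return math.exp(-(r - 0.3) ** 2 - 0.5 * (p - 0.2) ** 2)

def dV_dr(r):
    """Gradient of the assumed external potential."""
    return k * r

def delta_f_direct(r, p, dt):
    """f at element B minus f at element A; particles in B reach A after dt."""
    return f(r - p / m * dt, p + dV_dr(r) * dt) - f(r, p)

def delta_f_poisson(r, p, dt):
    """{H, f} dt with H = p^2/(2m) + V(r), using analytic derivatives of f."""
    df_dr = -2.0 * (r - 0.3) * f(r, p)
    df_dp = -(p - 0.2) * f(r, p)
    return (-p / m * df_dr + dV_dr(r) * df_dp) * dt

dt = 1e-4
print(delta_f_direct(1.0, 0.7, dt), delta_f_poisson(1.0, 0.7, dt))
```

The two numbers agree up to terms of second order in the time step, which is all that the derivation of (8-27b) claims.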


Figure 8-1: Depicting volume elements A, B, each measuring $\delta^{(3)}r\,\delta^{(3)}p$
(schematic) in the single-particle phase space (i.e., the $\mu$-space), the two
being around the points (r, p) and $(\mathbf{r}-\frac{\mathbf{p}}{m}\delta t,\ \mathbf{p})$ respectively, where $\delta t$ denotes a small time
interval; particles in B move over to A during the interval $\delta t$ in the course of their
free flight, i.e., rectilinear motion in regions away from those where their motion
is influenced by the short range repulsive interaction; the latter is referred to as
a collision (see fig. 8-2); we assume for the sake of simplicity that the particles
of the system are not acted upon by an external field; generalization to include
an external field is straightforward.

(B) Particles are knocked out of the chosen volume element A within any
specified time interval $\delta t$ in virtue of collisions undergone by particles
of velocity $\mathbf{p}/m$ with those having all possible velocities. Focusing our
attention on any one particle of momentum p and considering particles
of momentum p’ streaming past it, one can define a differential
scattering cross-section $\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')$ as follows. We consider only those collisions
in which the particles immediately after a collision have momenta P, P’,
and describe such a collision in relative co-ordinates in which the relative
distance between the particles and the relative velocity of the second
particle with respect to the first undergo a change as a result of the collision.
Assuming that the interaction between the particles is a central one, the
trajectory of the second particle with reference to the first (these we refer
to as the scattered particle and the scatterer respectively) remains
confined to a plane, and the direction of the relative velocity suffers a change
without any change in its magnitude.


Fig. 8-2 depicts such a collision in which the polar angle of the
direction of the final relative velocity with respect to the direction of the initial
relative velocity is $\theta$. Further, the azimuthal angle between the plane
of scattering and a reference plane containing the polar axis (the line
through the scatterer O, along OP in the figure, parallel to the direction
of the initial relative velocity), is assumed to be $\phi$. In other words, the
polar and azimuthal angles of the line of flight of the scattered particle
after the collision are $(\theta, \phi)$. In the figure, the initial line of flight is seen
to intersect a plane through O and perpendicular to OP, at a point C with
plane polar co-ordinates $(b, \phi)$ (where OC $= b$ and $\angle$XOC $= \phi$). The spherical
polar angles $\theta, \phi$ are determined uniquely by the plane polar co-ordinates
$b, \phi$ (indeed, $\phi$ remains unchanged in the collision as a consequence of
our assumption that the interaction occurs through a central potential).
Equivalently, for given initial momenta p, p’ of the scatterer and the
scattered particle, the final momenta P, P’ are uniquely determined by $b, \phi$ or,
equally well, by $\theta, \phi$. Imagining a stream of scattered particles to be
incident uniformly on the plane XOC, one can define, in a hypothetical sense,
an effective area (call it $\delta a$) in this plane, such that particles incident at
any point within this area are scattered in directions with spherical polar
angles $\theta$ to $\theta+\delta\theta$ and $\phi$ to $\phi+\delta\phi$, corresponding to a small solid angle, say
$\delta\Omega$, generated at O. The ratio $\frac{\delta a}{\delta\Omega}$ in the limit of infinitesimally small values
of $\delta\theta, \delta\phi$ is then the differential scattering cross-section that we have
denoted above by $\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')$. For a given interaction potential, it depends on
the initial momenta p, p’ as also on the angles $\theta, \phi$ (equivalently, on $b, \phi$;
b is referred to as the ‘impact parameter’ of the collision; in the case of
interaction through a central potential, $\phi$ is a redundant co-ordinate).


Figure 8-2: Depicting a two-particle scattering event (schematic); relative to
one of the particles (the scatterer) at O, the other particle is incident from a large
distance with an impact parameter b; angles are measured in a spherical polar
co-ordinate system with the polar axis along the line OP, parallel to the relative
momentum p’ − p; $\theta$ denotes the angle of scattering (i.e., the angle between the
initial and final directions of motion of the incident particle), while $\phi$ denotes the
azimuthal angle with reference to the plane XOP shown in the figure, OX being a
reference line perpendicular to OP; in the case of scattering by a central
potential, the initial and final lines of flight lie in a single plane (the plane of the figure,
containing the point O and the initial line of flight); in other words, $\theta, \phi$ are the
angles of scattering, the differential scattering cross section being independent
of $\phi$ for any given value of p’ − p; however, it depends on the impact parameter
b; considering a plane through O perpendicular to OP, particles incident on the
area $b\,db\,d\phi$ on it are scattered within a small solid angle $d\Omega$ subtended at O ($b, \phi$
are plane polar co-ordinates in this plane); OQ denotes the closest distance of
approach between the two particles, the unit vector along which is denoted by $\hat{\mathbf{n}}$.

Let us imagine a cylinder erected on the hypothetically defined area $\delta a$,
with its axis parallel to OP, and with length $\frac{|\mathbf{p}'-\mathbf{p}|}{m}\delta t$. Particles with
momentum p’ contained within such a cylinder will collide with a target
particle (scatterer) in time $\delta t$. The number of such particles (those with
momentum p’) is, by the definition of the one-particle distribution
function f, given by $f(\mathbf{r},\mathbf{p}';t)\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\delta^{(3)}p'\,\delta a\,\delta t$. In order to work out the number of
p-particles knocked out in time $\delta t$ out of the volume element A indicated
above we have to multiply this with $f(\mathbf{r},\mathbf{p};t)\,\delta^{(3)}r\,\delta^{(3)}p$, i.e., the relevant
number of target particles (it can be assumed that, for sufficiently small
$\delta p$, a collision of the type indicated will indeed result in knocking out a
particle from the volume element A). Considering all possible momenta p’
of the particles participating in collisions with particles in the element A,
and all possible directions of the scattered particle, one obtains the
following expression for the number of particles knocked out of the element
$\delta^{(3)}r\,\delta^{(3)}p$ (referred to as A above) in time $\delta t$:

$$\delta f_{\rm out}\,\delta^{(3)}r\,\delta^{(3)}p = \delta^{(3)}r\,\delta^{(3)}p\,\delta t\int d^{(3)}p'\,d\Omega\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')\,f(\mathbf{r},\mathbf{p};t)\,f(\mathbf{r},\mathbf{p}';t), \tag{8-28a}$$

(check this formula out), i.e., the number knocked out in time $\delta t$ per
unit volume element of the single-particle phase space (i.e., the $\mu$-space)
around the point (r, p) is

$$\delta f_{\rm out}(\mathbf{r},\mathbf{p};t) = \delta t\int d^{(3)}p'\,d\Omega\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')\,f(\mathbf{r},\mathbf{p};t)\,f(\mathbf{r},\mathbf{p}';t). \tag{8-28b}$$


(C) In order to work out the number of p-particles knocked into the
chosen volume element $\delta^{(3)}r\,\delta^{(3)}p$ (element A of fig. 8-1) in time $\delta t$ we note that,
corresponding to every collision knocking a p-particle out of A (this we
refer to as a ‘direct’ collision), there exists a uniquely defined restituting
collision that knocks a p-particle into A, where the restituting collision
is specified as follows: supposing that the direct collision involves
‘incoming’ particles of momenta p, p’, and ‘outgoing’ particles of momenta
P, P’, the restituting collision will have P, P’ as incoming momenta and
p, p’ as outgoing momenta (i.e., symbolically, P, P’ → p, p’); moreover, if
the unit vector along the relative separation between the two particles be
$\hat{\mathbf{n}}$ when they are at the closest distance of approach in the direct collision,
the corresponding unit vector in the restituting collision is $-\hat{\mathbf{n}}$ (see fig. 8-3).
It is a consequence of the dynamics, i.e., of the equations of motion,
that P, P’ are uniquely determined by p, p’, once $\hat{\mathbf{n}}$ is specified and that
the above correspondence between the direct and restituting collisions is
well defined. Since each restituting collision adds one p-particle to the
volume element under consideration in the $\mu$-space just as a direct
collision takes away a p-particle from it, an argument paralleling the one
leading to (8-28b) gives (in a notation that I hope is self-evident)

$$\delta f_{\rm in}(\mathbf{r},\mathbf{p};t) = \delta t\int d^{(3)}p'\,d\Omega\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')\,f(\mathbf{r},\mathbf{P};t)\,f(\mathbf{r},\mathbf{P}';t), \tag{8-29}$$

where we have made use of the conservation of the magnitude of the
relative velocity in an elastic collision and of the conservation of volume
element in the phase space. By definition of the restituting collision, $\frac{d\sigma}{d\Omega}$
remains the same in (8-28b) and (8-29).


Figure 8-3: Showing the relation between a direct collision and the
corresponding restituting collision; in the figure to the left, showing the direct collision,
the incoming momenta p, p’ are changed to outgoing momenta as p, p’ → P, P’,
while in the figure to the right, showing the restituting collision, the incoming
momenta P, P’ get transformed as P, P’ → p, p’; in addition, the unit vectors at
the point of closest approach in the two collisions are as shown, the two unit
vectors being oppositely directed; based on [29], fig. 2.5.


Putting together the changes in the one-particle distribution function on
all the three counts, we obtain, for the net rate of change of f(r, p; t), the
Boltzmann equation,

$$\frac{\partial f(\mathbf{r},\mathbf{p};t)}{\partial t} = \{H, f\} + \int d^{(3)}p'\,d\Omega\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\frac{d\sigma}{d\Omega}\,(\tilde{f}\tilde{f}' - ff'), \tag{8-30}$$

where the first term on the right hand side represents the ‘streaming’ part
of $\frac{\partial f}{\partial t}$ as obtained from (8-27b), (8-27c).


In this non-linear integro-differential equation, the last factor under the
integral sign, $(\tilde{f}\tilde{f}' - ff')$, is an abbreviation for f(r, P; t) f(r, P’; t) − f(r, p; t) f(r, p’; t).
Further, one can replace, within the integral sign, $d\Omega\,\frac{d\sigma}{d\Omega}(\mathbf{p},\mathbf{p}')$ with $b\,db\,d\phi$
(in the notation explained above; note that the area $\delta a$ referred to above
can be written, within the integral sign, as $b\,db\,d\phi$; reason out why) while
recalling that, relative to the scatterer, all the particles with relative
momentum p’ − p incident on an area $b\,db\,d\phi$ in the plane XOC shown in fig. 8-2
are scattered in the direction $\theta, \phi$ within a solid angle $d\Omega = \sin\theta\,d\theta\,d\phi$.


The Boltzmann equation (8-30) is based on the fundamental assumption 
of molecular chaos introduced by Boltzmann, which essentially means 
that the molecules of momenta p, p’ undergoing the collision are uncor- 
related, i.e., the distribution functions f, f’ corresponding to the two mo- 
menta are independent of one another. It is due to this assumption that 
one can use the product of the two functions in the expression (8-28a) 
in which one of the two momenta is integrated over (if the assumption 
of molecular chaos were not made, one would have to introduce a two- 
particle distribution function in the place of the product). While the par- 
ticles about to collide are uncorrelated, those coming out of a collision 
are not so, though their correlation gets lost once the particles move out 
of their mutual interaction range and get involved in other collisions as 
incoming particles. Thus, P,P’ are correlated with p, p’, while the corre- 
spondence between the direct and restituting collisions, along with the 
fact that there is no correlation between colliding molecules, allows us to
write (8-29), featuring the product f(r, P; t) f(r, P’; t).

From the principles of conservation of energy and momentum, the
relations between the final and the initial momenta of the two interacting
particles in the direct collision are obtained as

$$\mathbf{P} = \mathbf{p} + [(\mathbf{p}'-\mathbf{p})\cdot\hat{\mathbf{n}}]\,\hat{\mathbf{n}}, \qquad \mathbf{P}' = \mathbf{p}' - [(\mathbf{p}'-\mathbf{p})\cdot\hat{\mathbf{n}}]\,\hat{\mathbf{n}}. \tag{8-31}$$
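The collision rule (8-31) is easy to test numerically. The following minimal sketch (the function name `collide` and the sample momenta are illustrative assumptions) applies the rule for equal masses, so that one can verify conservation of momentum and kinetic energy, and can check that the restituting collision, with the unit vector reversed, takes P, P’ back to p, p’.

```python
def dot(a, b):
    """Scalar product of two 3-vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def collide(p, p_prime, n):
    """Apply the rule (8-31) for equal masses; n must be a unit vector."""
    s = dot([b - a for a, b in zip(p, p_prime)], n)    # (p' - p) . n
    P = [a + s * c for a, c in zip(p, n)]
    P_prime = [b - s * c for b, c in zip(p_prime, n)]
    return P, P_prime

# Sample incoming momenta and unit vector (arbitrary illustrative values).
p = [1.0, 0.5, -0.2]
p_prime = [-0.3, 0.8, 0.4]
n = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]

P, P_prime = collide(p, p_prime, n)
back_p, back_p_prime = collide(P, P_prime, [-c for c in n])  # restituting collision

print("outgoing momenta:", P, P_prime)
print("recovered incoming momenta:", back_p, back_p_prime)
```

Since the masses are equal, conservation of kinetic energy amounts to conservation of the sum of the squared momenta, which the rule respects for any unit vector n.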


In the special case of a hard sphere collision, if the radius of a hard
sphere representing a molecule be r, then one can effectively describe
a collision as that between a point particle (representing the center
of a hard sphere) with a sphere of radius R = 2r (the ‘sphere of
influence’; R is referred to as the ‘collision diameter’) and one finds that
the angle between the vectors p’ − p and $\hat{\mathbf{n}}$ is $\frac{\pi}{2} + \frac{\theta}{2}$ where $\theta$ is the
angle of scattering, related to the impact parameter as $b = R\cos\frac{\theta}{2}$ (check
these statements out).
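Given the hard-sphere relation between the impact parameter and the angle of scattering, the differential cross-section follows from the standard relation (cross-section) = (b / sin θ) |db/dθ|, which is assumed here rather than derived in the text. The sketch below evaluates it by numerical differentiation; the value R = 1 is an arbitrary choice of units.

```python
import math

R = 1.0  # collision diameter (arbitrary units, assumed)

def impact_parameter(theta):
    """Hard-sphere relation b = R cos(theta / 2)."""
    return R * math.cos(theta / 2.0)

def dsigma_dOmega(theta, h=1e-6):
    """(b / sin theta) |db/dtheta|, with db/dtheta from a central difference."""
    db = (impact_parameter(theta + h) - impact_parameter(theta - h)) / (2.0 * h)
    return impact_parameter(theta) * abs(db) / math.sin(theta)

def total_cross_section(n=2000):
    """Midpoint-rule integral of the differential cross-section over all solid angles."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += 2.0 * math.pi * dsigma_dOmega(theta) * math.sin(theta) * dtheta
    return total

print(dsigma_dOmega(0.4), dsigma_dOmega(2.0))   # both close to R**2 / 4
print(total_cross_section(), math.pi * R**2)    # close to each other
```

The differential cross-section comes out independent of the angle and equal to R²/4, and its integral over all directions reproduces the geometrical cross-section πR².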


Strictly speaking, the assumption of molecular chaos does not follow 
from dynamical principles alone. In other words, the Boltzmann equa- 
tion (or any other basic principle of statistical mechanics) is not reducible 
to dynamical principles. On the other hand, the assumption of molecular 
chaos, which is essentially of a statistical nature, is an eminently rea- 
sonable one from an intuitive point of view and, moreover, conclusions 
drawn from the Boltzmann equation are quite well borne out in exper- 
imental observations. It is in this context that the theory of dynamical 
chaos, to be briefly outlined in section 9.5, assumes relevance. The be- 
havior of a chaotic dynamical system is described meaningfully only in 
statistical terms. However, dynamical chaos alone does not constitute
the ultimate justification of the principles of statistical mechanics, where
the large number of the constituent particles of a system turns out to be
of equal relevance. Put differently, the conclusion is inescapable that
statistical mechanics is a discipline emergent from dynamics while not being
reducible to it.


Incidentally, the Boltzmann equation (8-30) can be written in an
alternative but equivalent form in terms of what can be referred to as the
scattering probability $\omega(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')$ of a collision event p → P, p’ → P’,
defined by requiring that the rate (per unit volume) at which a p-particle
at r, by colliding with a particle in the momentum range p’ to p’ + dp’, also
located at r, produces two particles in momentum ranges P to P + dP, P’
to P’ + dP’ be

$$\text{rate} = \omega(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,f^{[2]}(\mathbf{r},\mathbf{r},\mathbf{p},\mathbf{p}')\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P', \tag{8-32a}$$

where $f^{[2]}$ denotes the two-particle distribution function.


By invoking time reversal invariance and assuming that the scattering
process is also parity invariant, one finds that the scattering probability
satisfies the symmetry relation (refer to [137])

$$\omega(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}') = \omega(\mathbf{P},\mathbf{P}'\to\mathbf{p},\mathbf{p}'). \tag{8-32b}$$


On invoking the assumption of molecular chaos one obtains the alternative
form of the Boltzmann equation sought for,

$$\frac{\partial f}{\partial t} = \{H, f\} + \int d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,\omega(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,(\tilde{f}\tilde{f}' - ff'), \tag{8-33}$$

where H stands for the single-particle Hamiltonian and $f, f', \tilde{f}, \tilde{f}'$ have the
same significance as in (8-30). An analysis of the kinematics of the
scattering process enables one to relate the scattering probability $\omega$ with the
differential cross-section $\frac{d\sigma}{d\Omega}$ and thereby arrive at (8-30) from (8-33).


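Referring to the expressions for $\frac{\partial f}{\partial t}$ in (8-30), (8-33), the second term on
the right hand side of either formula gives the contribution of collisions
to the rate of change of the single-particle distribution function, which we
denote by

$$\begin{aligned}
\Big(\frac{\partial f}{\partial t}\Big)_{\rm coll} &= \int d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,\omega(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,(\tilde{f}\tilde{f}' - ff')\\
&= \int d^{(3)}p'\,d\Omega\,\frac{|\mathbf{p}'-\mathbf{p}|}{m}\,\frac{d\sigma}{d\Omega}\,(\tilde{f}\tilde{f}' - ff').
\end{aligned} \tag{8-34}$$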


This is referred to as the collision integral. As already mentioned, the
expression $\tilde{f}\tilde{f}' - ff'$ is an abbreviation for f(r, P; t) f(r, P’; t) − f(r, p; t) f(r, p’; t).
We shall meet with this object in sec. 8.3.5, on the H-theorem.
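As a small illustration of how the gain-minus-loss factor in the collision integral behaves, the sketch below evaluates it for a Maxwell distribution, using the collision rule (8-31) to generate the outgoing momenta. The choice of a non-drifting Maxwellian with m = kT = 1, the function names, and the sample momenta are all assumptions made for illustration. Because the rule conserves the total kinetic energy, the factor vanishes for every collision, up to rounding.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def collide(p, p_prime, n):
    """Outgoing momenta from the rule (8-31); n must be a unit vector."""
    s = dot([b - a for a, b in zip(p, p_prime)], n)
    return ([a + s * c for a, c in zip(p, n)],
            [b - s * c for b, c in zip(p_prime, n)])

def maxwellian(p, m=1.0, kT=1.0, n0=1.0):
    """Assumed equilibrium distribution, uniform in space with density n0."""
    return n0 * (2.0 * math.pi * m * kT) ** -1.5 * math.exp(-dot(p, p) / (2.0 * m * kT))

def gain_minus_loss(p, p_prime, n):
    """The factor f(P) f(P') - f(p) f(p') for a given collision geometry."""
    P, P_prime = collide(p, p_prime, n)
    return maxwellian(P) * maxwellian(P_prime) - maxwellian(p) * maxwellian(p_prime)

samples = [
    ([1.0, 0.5, -0.2], [-0.3, 0.8, 0.4], [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]),
    ([0.1, -1.2, 0.7], [0.9, 0.0, -0.5], [0.0, 0.6, 0.8]),
]
for p, p_prime, n in samples:
    print(gain_minus_loss(p, p_prime, n))
```

The printed values are zero to within floating-point rounding, in line with the expectation that an equilibrium distribution is not changed by collisions.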


8.3.3 The Liouville Equation 


The equilibrium ensembles of statistical mechanics correspond to
time-independent phase space distribution functions $\rho(\mathbf{r}^{[N]},\mathbf{p}^{[N]})$ in the
classical description. For a non-equilibrium situation, on the other hand, the
time evolution of a macroscopic system is described in terms of a
time-dependent distribution function $\rho(\mathbf{r}^{[N]},\mathbf{p}^{[N]};t)$, where $\rho(\mathbf{r}^{[N]},\mathbf{p}^{[N]};t)\,\delta^{[N]}r\,\delta^{[N]}p$
gives the probability that the microstate of the system under
consideration is located within a volume element $\delta^{[N]}r\,\delta^{[N]}p$ around the point
$(\mathbf{r}^{[N]},\mathbf{p}^{[N]})$ at the time instant t. The question then arises as to how, given
the Hamiltonian of the system, the distribution function for the evolving
system changes with time.


Recall that the symbols $\mathbf{r}^{[N]}$ and $\mathbf{p}^{[N]}$ stand for the collections of
position and momentum vectors ($\{\mathbf{r}_1,\mathbf{r}_2,\cdots,\mathbf{r}_N\}$, $\{\mathbf{p}_1,\mathbf{p}_2,\cdots,\mathbf{p}_N\}$) for
a system made up of N particles. The Cartesian components of
these vectors are denoted by $\{q\} = \{q_1,q_2,\cdots,q_s\}$, $\{p\} = \{p_1,p_2,\cdots,p_s\}$ (s =
3N) (alternatively, these may denote appropriately defined
generalized co-ordinates and corresponding generalized momenta).
Thus, the sub-index i in $\mathbf{r}_i$ or $\mathbf{p}_i$ runs from 1 to N while that in $q_i$
or $p_i$ runs from 1 to s = 3N. In the following the 6N phase space
co-ordinates {q}, {p} will be denoted jointly by z, made up of
components $z_i$, with i running from 1 to 6N. The symbol $\delta z$ will
stand for $\delta^{[N]}r\,\delta^{[N]}p$ (i.e., within an integral sign, $d^{[N]}r\,d^{[N]}p \to dz$).


This evolution equation is not difficult to arrive at. If the system is in a
microstate corresponding to the point {q}, {p}, i.e., z (in an abbreviated
notation), in the phase space at time t then, at time $t + \delta t$, it will be in the
microstate $\{q + \dot{q}\,\delta t\}, \{p + \dot{p}\,\delta t\}$ (i.e., $z + \dot{z}\,\delta t$ in brief) for sufficiently small $\delta t$.
This implies

$$\rho(z;t) = \rho(z + \dot{z}\,\delta t;\ t + \delta t), \tag{8-35a}$$

(reason this out), i.e.,

$$\frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial z}\cdot\dot{z} = 0. \tag{8-35b}$$

The last term on the left hand side of the above equation is an abbreviation
for $\sum_{i=1}^{6N}\frac{\partial\rho}{\partial z_i}\dot{z}_i$.


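In terms of {q}, {p}, the above equation reads

$$\frac{\partial\rho}{\partial t} + \sum_{i=1}^{s}\Big(\frac{\partial\rho}{\partial q_i}\dot{q}_i + \frac{\partial\rho}{\partial p_i}\dot{p}_i\Big) = 0. \tag{8-35c}$$

If the system undergoes Hamiltonian evolution with Hamiltonian H({q}, {p}),
then one can make use of the Hamiltonian equations of motion to obtain

$$\frac{\partial\rho}{\partial t} + \{\rho, H\} = 0. \tag{8-35d}$$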


Any of the equations (8-35b)-(8-35d) can be taken as the statement of
Liouville’s equation, where a Hamiltonian time evolution of the microstates
is assumed.
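The step from (8-35c) to (8-35d) can be spelled out as follows. This is only a sketch, and it assumes the Poisson bracket convention $\{A,B\} = \sum_i \big(\frac{\partial A}{\partial q_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial q_i}\big)$, which is the one consistent with (8-27b); with the opposite convention the bracket changes sign.

```latex
\begin{align*}
\dot{q}_i &= \frac{\partial H}{\partial p_i}, \qquad
\dot{p}_i = -\frac{\partial H}{\partial q_i} \qquad (i = 1, \dots, s),\\
\sum_{i=1}^{s}\left(\frac{\partial\rho}{\partial q_i}\,\dot{q}_i
  + \frac{\partial\rho}{\partial p_i}\,\dot{p}_i\right)
&= \sum_{i=1}^{s}\left(\frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i}
  - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i}\right)
 = \{\rho, H\},\\
\sum_{i=1}^{s}\left(\frac{\partial\dot{q}_i}{\partial q_i}
  + \frac{\partial\dot{p}_i}{\partial p_i}\right)
&= \sum_{i=1}^{s}\left(\frac{\partial^{2} H}{\partial q_i\,\partial p_i}
  - \frac{\partial^{2} H}{\partial p_i\,\partial q_i}\right) = 0.
\end{align*}
```

The second line turns (8-35c) into (8-35d), while the third line expresses the fact that the Hamiltonian flow in the phase space has zero divergence, which is what underlies the invariance of phase space volume.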


Strictly speaking, the Hamiltonian time evolution of a system leaves no room 
for dissipation. One way to include dissipation in the theory is to intro- 
duce a statistical assumption such as that of molecular chaos mentioned in 
sec. 8.3.2. Another basic approach in non-equilibrium statistical mechanics 
is to make use of an effective non-Hamiltonian evolution of the microstates. 
This will be taken up in sec. 9.9. Finally, the reduction from a large
system to a smaller subsystem makes the evolution of the latter dissipative in
nature (sec. 8.7).


The Liouville equation can be interpreted as saying that the distribution
function evolves in the phase space in a manner analogous to an
incompressible fluid. The conservation of volume for the latter is analogous
to the conservation of the total probability ($\int\rho(z;t)\,dz = 1$) in the phase
space. Indeed, the Liouville equation in the case of Hamiltonian
evolution is equivalent to the Liouville theorem which states that the volume
of a region in the phase space occupied by an initial distribution of
microstates remains unaltered as the microstates evolve in time. The shape
of the region may change, but the volume does not (see fig. 8-4). Indeed,
the shape change is of crucial relevance in explaining the approach to
equilibrium from an initial macrostate, and the attendant entropy
production. This we will have a look at in chapter 9.


Figure 8-4: Depicting the time evolution of a distribution of microstates in
the phase space (schematic); an initial distribution marked A, evolves in time
in accordance with the Liouville equation (formula (8-35d)), preserving its
volume but progressively changing shape (A → B → C); for a system made up of a
large number of interacting particles, the most likely scenario is that the
distribution spreads out in the phase space by the proliferation of thin filamentous
outgrowths.
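The content of the Liouville theorem can be illustrated with a system of a single degree of freedom. The sketch below is a toy example under stated assumptions: the pendulum Hamiltonian H = p²/2 − cos q, the initial circular region, and the leapfrog integrator (chosen because each of its sub-steps is an area-preserving shear) are all illustrative choices, not taken from the text. The boundary of the region is followed in time, and its enclosed area and perimeter are compared before and after.

```python
import math

def leapfrog_step(q, p, dt):
    """One kick-drift-kick step for the assumed Hamiltonian H = p^2/2 - cos(q)."""
    p -= 0.5 * dt * math.sin(q)
    q += dt * p
    p -= 0.5 * dt * math.sin(q)
    return q, p

def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered closed polygon."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Total length of the closed polygon."""
    return sum(math.hypot(q2 - q1, p2 - p1)
               for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]))

def evolve_boundary(points, t_final, dt=0.02):
    """Advance every boundary point up to time t_final."""
    for _ in range(int(round(t_final / dt))):
        points = [leapfrog_step(q, p, dt) for q, p in points]
    return points

n_points = 2000
boundary = [(1.0 + 0.5 * math.cos(2.0 * math.pi * j / n_points),
             0.5 * math.sin(2.0 * math.pi * j / n_points))
            for j in range(n_points)]
evolved = evolve_boundary(boundary, t_final=10.0)

area_initial, area_final = polygon_area(boundary), polygon_area(evolved)
perimeter_initial, perimeter_final = polygon_perimeter(boundary), polygon_perimeter(evolved)
print(area_initial, area_final)            # nearly equal
print(perimeter_initial, perimeter_final)  # the region gets distorted
```

The area stays the same to within the accuracy of the polygonal approximation, while the perimeter grows as the region is sheared out of its initial circular shape, which is a one-degree-of-freedom caricature of the scenario shown in fig. 8-4.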


The Liouville equation describes the time evolution of a mixed state in
the classical description (where the mixed state represents a probability
distribution over pure states in the phase space) and constitutes the
basis of non-equilibrium statistical mechanics. This will be taken up
again in sec. 8.4.2.1.


In the quantum context, the corresponding equation describing the
evolution of a mixed state (represented by a density operator) in the Hilbert
space of a system is referred to as the von Neumann equation (or the
Liouville-von Neumann equation), and will be referred to in sec. 8.4.2.2.


8.3.4 The BBGKY hierarchy 


The N-particle distribution function $\rho$ in the 6N-dimensional phase space
carries far too much detailed information to be of practical use for a
system made up of a large number of particles ($N \to \infty$). In most
situations involving a dilute gas, the 1-particle distribution function f defined
in sec. 8.3.2 is sufficient to describe and explain most of the observed
macroscopic properties of the system. Distribution functions involving
a higher number of particles are deemed to be useful for denser
systems, and one then needs to know how these distribution functions
evolve in time.


We define the time-dependent n-particle distribution function

$$f^{[n]}(\mathbf{r}_1,\mathbf{r}_2,\cdots,\mathbf{r}_n,\mathbf{p}_1,\mathbf{p}_2,\cdots,\mathbf{p}_n;t) \qquad (n = 1,2,\cdots,N)$$

by requiring that $f^{[n]}(\mathbf{r}^{[n]},\mathbf{p}^{[n]};t)\,\delta^{[n]}r\,\delta^{[n]}p$ give the probability that the
point describing the state of n number of arbitrarily chosen particles lies
within a volume element $\delta^{[n]}r\,\delta^{[n]}p$ around the point $(\mathbf{r}^{[n]},\mathbf{p}^{[n]})$ in the 6n
dimensional phase space relating to these particles (in a notation that
I hope is self-evident by now). Given the integer n ($1 \le n \le N$), $f^{[n]}$ is
determined by the phase-space distribution $\rho$ for the entire system as

$$f^{[n]}(\mathbf{r}_1,\mathbf{r}_2,\cdots,\mathbf{r}_n,\mathbf{p}_1,\mathbf{p}_2,\cdots,\mathbf{p}_n;t) = \frac{N!}{(N-n)!}\int\prod_{j=1}^{N-n} d^{(3)}r_{n+j}\,d^{(3)}p_{n+j}\;\rho(\mathbf{r}_1,\cdots,\mathbf{r}_N,\mathbf{p}_1,\cdots,\mathbf{p}_N;t) \qquad (n = 1,2,\cdots,N). \tag{8-36}$$

The pre-factor of $\frac{N!}{(N-n)!}$ on the right hand side (the same pre-factor
appears in (4-63a) in the case of equilibrium distribution functions in
chapter 4, where the notation is slightly different) appears because of the fact
that the particles are all identical, and it does not matter which of the
particles are marked with sub-indices 1, 2, ..., n (the first particle can be
chosen in any one of N number of ways, the second in N − 1, and so on).
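The counting argument behind the pre-factor can be checked by brute force for small numbers. In the sketch below, the values N = 6 and n = 3 are arbitrary illustrative choices; the number of ordered ways of picking n labelled particles out of N is enumerated explicitly and compared with N!/(N − n)!.

```python
import math
from itertools import permutations

def prefactor(N, n):
    """N! / (N - n)!, i.e. N (N - 1) ... (N - n + 1)."""
    return math.factorial(N) // math.factorial(N - n)

N, n = 6, 3
# Explicit enumeration of ordered selections of n distinct particles out of N.
ordered_choices = sum(1 for _ in permutations(range(N), n))

print(prefactor(N, n), ordered_choices)
print(prefactor(N, 1))   # reduces to N for the one-particle function
```

For n = 1 the pre-factor reduces to N, in agreement with the normalization of the one-particle distribution function as a number density in the single-particle phase space.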


It is the one-particle distribution function $f^{[1]}(\mathbf{r},\mathbf{p};t)$, abbreviated as f(r, p; t),
that was the object of interest in sections 8.3.1 and 8.3.2. The BBGKY
theory provides the setting for the Boltzmann equation by telling us how
the evolution of the one-particle distribution function relates to that of
the n-particle distribution functions for n = 2, ..., N.


The one-particle distribution function is defined in terms of the
distribution function $\rho(\mathbf{r}^{[N]},\mathbf{p}^{[N]};t)$ for the entire system of N particles as

$$f(\mathbf{r},\mathbf{p};t) = N\int\prod_{i=2}^{N} d^{(3)}r_i\,d^{(3)}p_i\;\rho(\mathbf{r},\mathbf{r}_2,\cdots,\mathbf{r}_N,\mathbf{p},\mathbf{p}_2,\cdots,\mathbf{p}_N;t), \tag{8-37}$$

and, as mentioned above, represents the density of particles in the
$\mu$-space (i.e., the phase space for a single particle) at position r and
momentum p. Integrated over p, it gives the spatial number density (n(r; t))
at r at time t, the volume integral of the latter being N, independent of
time:

$$n(\mathbf{r};t) = \int d^{(3)}p\,f(\mathbf{r},\mathbf{p};t), \qquad \int d^{(3)}r\,n(\mathbf{r};t) = N. \tag{8-38}$$


Starting from the Liouville equation (8-35d), and making use of the definition
(8-36), it is straightforward to derive the evolution equation for $f^{[n]}$,
in which $\frac{\partial f^{[n]}}{\partial t}$ is found to depend on $f^{[n+1]}$:

$$\frac{\partial f^{[n]}}{\partial t}=\{H^{[n]},f^{[n]}\}+\sum_{i=1}^{n}\int d^{(3)}r_{n+1}\,d^{(3)}p_{n+1}\,\frac{\partial\phi(\mathbf{r}_i-\mathbf{r}_{n+1})}{\partial\mathbf{r}_i}\cdot\frac{\partial f^{[n+1]}}{\partial\mathbf{p}_i}. \tag{8-39a}$$

In writing the above formula, referred to as the BBGKY hierarchy, we have
assumed the following general form of the Hamiltonian H of the system
under consideration:

$$H=\sum_{i=1}^{N}\left[\frac{p_i^2}{2m}+V(\mathbf{r}_i)\right]+\sum_{i<j}\phi(\mathbf{r}_i-\mathbf{r}_j), \tag{8-39b}$$

where V stands for the potential energy of the individual particles due
to the external field, if any, acting on the system and $\phi(\mathbf{r},\mathbf{r}')$ stands for
the potential energy of mutual interaction of two particles located at $\mathbf{r},\mathbf{r}'$
(we assume that the interaction is of the two-body type). We then define
$H^{[n]}$ as that part of the Hamiltonian which depends only on $\mathbf{r}_i,\mathbf{p}_i$ ($i=1,2,\cdots,n$),
integrate both sides of (8-35d) over $\mathbf{r}_i,\mathbf{p}_i$ ($i=n+1,n+2,\cdots,N$)
and make use of relevant boundary conditions (the distribution functions
vanish outside the volume within which the system is confined and for the
relevant momentum components going to $\pm\infty$) so as to arrive at (8-39a)
on making a few algebraic manipulations (I skip the details here; see, for
instance, [101], [16]).


The BBGKY hierarchy (8-39a) gives us the time rate of change of $f^{[n]}$ as a
sum of two terms where the first term on the right hand side represents
the streaming part (compare with the first term on the right hand side
of (8-30)), while the second can be interpreted as the effect of the mutual
interaction between the particles of the system, commonly referred to
as the ‘collision’ part, and involves $f^{[n+1]}$. Thus, it represents a closed
system of equations only when all the N number of particles are taken
into account, in which case the Liouville equation itself appears as the
last equation in the hierarchy and implies all the preceding equations. In
order that the hierarchy be of practical value, one needs to truncate it at
some stage by invoking additional assumption(s).


One situation where the BBGKY hierarchy appears as a convenient system 
of equations is the one where the inter-particle interaction is a weak and 
long-range one such as the Coulomb force in an ionized gas or the gravita- 
tional force between the stars in a stellar system. In this case the BBGKY 
hierarchy leads, as an approximation, to the Vlasov equation which involves 
only the single-particle distribution function, though in a way completely 
different from what the Boltzmann equation entails (see [16], chapter ID). 
The latter is a useful approximation for a dilute gas of neutral particles
where the inter-particle interaction is of short range, in the nature of an
impact between two particles.

In the special case of n = 1, eq. (8-39a) gives (with the notation $f^{[1]}\to f$),

$$\frac{\partial f(\mathbf{r},\mathbf{p};t)}{\partial t}=\{H^{[1]},f\}+\int d^{(3)}r'\,d^{(3)}p'\,\frac{\partial\phi(\mathbf{r}-\mathbf{r}')}{\partial\mathbf{r}}\cdot\frac{\partial f^{[2]}(\mathbf{r},\mathbf{r}',\mathbf{p},\mathbf{p}';t)}{\partial\mathbf{p}}, \tag{8-40}$$

where $H^{[1]}$ stands for the single-particle Hamiltonian given by (8-27c),
where it is denoted by H (in the present context, however, H stands for
the system Hamiltonian, eq. (8-39b)). In this formula, the first term on
the right hand side is the streaming contribution and the second term
represents the effect of the inter-particle interaction or ‘collision’. The
next equation of the hierarchy gives the rate of change of $f^{[2]}$ and involves
the three-particle distribution function $f^{[3]}$:

$$\frac{\partial f^{[2]}(\mathbf{r}_1,\mathbf{r}_2,\mathbf{p}_1,\mathbf{p}_2;t)}{\partial t}=\{H^{[2]},f^{[2]}\}+\sum_{i=1,2}\int d^{(3)}r_3\,d^{(3)}p_3\,\frac{\partial\phi(\mathbf{r}_i-\mathbf{r}_3)}{\partial\mathbf{r}_i}\cdot\frac{\partial f^{[3]}}{\partial\mathbf{p}_i}. \tag{8-41}$$


As mentioned in sec. 4.5.5, the BGY equations are the equilibrium (i.e.,
time-independent) version of the BBGKY hierarchy introduced in the present
section. Indeed, if one assumes the distribution function $f^{[n]}(\mathbf{r}^{[n]},\mathbf{p}^{[n]},t)$ to
be in the form of a product

$$f^{[n]}_{eq}=g^{[n]}(\mathbf{r}^{[n]})\,e^{-\frac{1}{2mk_BT}\sum_{i=1}^{n}p_i^2}$$

and substitutes in (8-39a), one is led to the BGY hierarchy (4-103) being
satisfied by $g^{[n]}$ (generalized by the inclusion of an external field acting on
the system) [101], chapter 18.

Formula (8-40) leads us to the Boltzmann equation under an appropriate
scheme of approximation.

The BBGKY hierarchy of equations constitutes a convenient cataloging of
all the information carried by the Liouville equation describing the evo- 
lution of the full time-dependent probability distribution in the phase 
space. Its usefulness lies in breaking up this entire information into 
separate (though not independent) packages such that appropriate ap- 
proximation schemes can be devised so as to extract meaningful results 
by ignoring redundant information within a given context. This is an 
instance of ‘reduction’ or ‘projection’ that constitutes the essence of the 
entire approach of statistical mechanics. Indeed, the very idea of explain- 
ing (or interpreting) the thermodynamic behavior of a system in terms of 
statistical mechanics, which starts from the microscopic description, is 
the ultimate instance of reduction (the terms ‘contraction’ and ‘projection’ 
are also used, depending on the context) where one extracts relevant in- 
formation about a few macroscopic variables from an enormous number 
of microscopic ones (in this context, refer to sec. 8.7 later in the present 
chapter). Sec. 8.3.4.1 below indicates how the BBGKY hierarchy can be
‘reduced’ to the Boltzmann equation.


8.3.4.1 From BBGKY to Boltzmann 


It is worthwhile to see how the Boltzmann equation relates
to the BBGKY hierarchy. For this, one has to refer to the first two equations
(8-40), (8-41) of the hierarchy. Referring to the short range (say, d,
of the order of the atomic size) of the inter-particle interaction and the
fact that the system under consideration is a dilute gas, it is seen on
physical grounds that two distinct time scales are involved in these two
evolution equations, namely, the collision time $\tau_c$ and the relaxation time
$\tau_r$ given by

$$\tau_c=\frac{d}{v_{rel}},\qquad \tau_r=\frac{\lambda}{v_{rel}}, \tag{8-42a}$$

where the two satisfy

$$\tau_c\ll\tau_r. \tag{8-42b}$$

In these formulae $v_{rel}$ stands for the mean relative velocity between any
pair of particles in the system, and $\lambda$ stands for the mean free path between
collisions. One finds from eq. (8-41) that both the terms in the
expression for $\frac{\partial f^{[2]}}{\partial t}$ are large for a time interval $\sim\tau_c$ after which they fall
off rapidly. On the other hand, the first term on the right hand side of
eq. (8-40) varies appreciably over the much larger time scale $\tau_r$, which
tells us that, in this equation, $f^{[2]}$ can be replaced with its equilibrium
solution.
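To get a feel for how well separated the two time scales of (8-42a) are, the following minimal sketch evaluates them for one set of illustrative numbers. The hard-sphere estimate $\lambda = 1/(\sqrt{2}\,\pi d^2 n)$ for the mean free path, as well as the numerical values of the number density, the range d, and $v_{rel}$, are assumptions made here for the sake of illustration (roughly those of a gas at ordinary temperature and pressure); they are not taken from the text.

```python
import math

def mean_free_path(number_density, diameter):
    """Hard-sphere estimate (an assumption): lambda = 1 / (sqrt(2) * pi * d^2 * n)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * diameter ** 2 * number_density)

def time_scales(number_density, diameter, v_rel):
    """Return (tau_c, tau_r) = (d / v_rel, lambda / v_rel), as in (8-42a)."""
    lam = mean_free_path(number_density, diameter)
    return diameter / v_rel, lam / v_rel

# Illustrative (assumed) values in SI units.
n = 2.5e25      # number density, m^-3
d = 3.0e-10     # range of the interaction, m
v_rel = 6.0e2   # mean relative speed, m/s

lam = mean_free_path(n, d)
tau_c, tau_r = time_scales(n, d, v_rel)
print(f"lambda = {lam:.3e} m, tau_c = {tau_c:.3e} s, tau_r = {tau_r:.3e} s")
print(f"tau_r / tau_c = {tau_r / tau_c:.1f}")
```

With these numbers the ratio $\tau_r/\tau_c=\lambda/d$ comes out to be a few hundred, which is the sense in which (8-42b) holds for a dilute gas.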


The equilibrium solution for $f^{[2]}(\mathbf{r},\mathbf{r}',\mathbf{p},\mathbf{p}';t)$ can be assumed to depend only
on the relative co-ordinate and momentum $(\mathbf{r}'-\mathbf{r},\,\mathbf{p}'-\mathbf{p})$, ignoring a slow
residual dependence on the center of mass co-ordinate and momentum
$\left(\frac{\mathbf{r}+\mathbf{r}'}{2},\,\mathbf{p}+\mathbf{p}'\right)$.
Making now a good number of algebraic manipulations and invoking the
assumption of molecular chaos one then finds that eq. (8-40) reduces to
the Boltzmann equation (8-33) (I skip this derivation here which, however,
is quite instructive; see [137] for details).




CHAPTER 8. NON-EQUILIBRIUM STATISTICAL MECHANICS 


8.3.5 The H-theorem and the equilibrium solution 


Starting from the Boltzmann equation in the form (8-33), one can con- 
struct the so-called H-function which is strictly decreasing throughout 
the time evolution of a system out of equilibrium and tends towards 
its minimum value (subject to the given physical constraints of volume, 
energy, and the number of particles, equivalent to a set of initial- and 
boundary conditions), the latter corresponding to the equilibrium state of 
the system under consideration (a dilute gas close to the limits (8-26); we 
consider a one-component gas for the sake of simplicity, while a gas made
up of more than one component can also be referred to). In this special
case of a dilute gas the Boltzmann equation, and the H-theorem result- 
ing from it, gives an answer (at least, a partial one) to the general ques- 
tion of why and how a macroscopic system always evolves towards an 
equilibrium configuration, with an attendant increase of entropy, where 
an unambiguous definition of entropy in a non-equilibrium configuration 
can be given in terms of the H-function itself (again, for the special case 
of a dilute gas; a more general definition of entropy is a central prob- 
lem in non-equilibrium statistical mechanics). Briefly, it is the statistical 
assumption of molecular chaos that introduces an arrow of time in the 
evolution of a macroscopic system that takes place in accordance with
microscopic dynamical laws impartial to the direction of time.


The H-function (a functional of the one-particle distribution function) is
defined as

$$H(t)=\int d^{(3)}r\,d^{(3)}p\,f(\mathbf{r},\mathbf{p};t)\ln f(\mathbf{r},\mathbf{p};t), \tag{8-43}$$

which is a well defined quantity unless $f(\mathbf{r},\mathbf{p};t)$ involves singularities. The
rate of change of the H-function is given by

$$\frac{dH}{dt}=\int d^{(3)}r\,d^{(3)}p\,(1+\ln f)\frac{\partial}{\partial t}f(\mathbf{r},\mathbf{p};t)=\int d^{(3)}r\,d^{(3)}p\,\ln f\,\frac{\partial f}{\partial t}, \tag{8-44}$$

since $\int d^{(3)}r\,d^{(3)}p\,f(\mathbf{r},\mathbf{p};t)\,(=N)$ is independent of time. Invoking the formula
(8-33), we then obtain

$$\frac{dH}{dt}=\int d^{(3)}r\,d^{(3)}p\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,\ln f(\mathbf{p})\,[f(\mathbf{P})f(\mathbf{P}')-f(\mathbf{p})f(\mathbf{p}')], \tag{8-45}$$

where we have explicitly retained the momentum arguments in the various
occurrences of the single-particle distribution functions since the
position- and time arguments are the same for all these. Thus, for instance,
$f(\mathbf{P})$ stands for $f(\mathbf{r},\mathbf{P};t)$ (this was abbreviated to $F$ in (8-30), (8-33)).


In arriving at (8-45) from (8-44), we have dropped a term $\int d^{(3)}r\,d^{(3)}p\,\{H,f\}\ln f$,
where H denotes the single-particle Hamiltonian, possibly containing
the potential energy term due to an external field. On performing
successive integrations by parts, this integral can be seen to reduce
to zero (check this out).

This tells us that the rate of change in the H-function is governed entirely
by the collision integral defined in (8-34), while the streaming term in $\frac{\partial f}{\partial t}$
makes no contribution to $\frac{dH}{dt}$.

Since all the momenta in (8-45) are dummy integration variables, we can
interchange $\mathbf{p}$ and $\mathbf{p}'$ (and also $\mathbf{P}$ and $\mathbf{P}'$) on the right hand side of (8-45).
Adding the latter to the resulting equation and dividing by two, one has

$$\frac{dH}{dt}=\frac{1}{2}\int d^{(3)}r\,d^{(3)}p\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,\ln\big(f(\mathbf{p})f(\mathbf{p}')\big)\,[f(\mathbf{P})f(\mathbf{P}')-f(\mathbf{p})f(\mathbf{p}')]. \tag{8-46}$$


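We now interchange $\mathbf{p},\mathbf{p}'$ with $\mathbf{P},\mathbf{P}'$ and make use of the symmetry property
(8-32b) and obtain, again by adding and dividing by 2,

$$\frac{dH}{dt}=-\frac{1}{4}\int d^{(3)}r\,d^{(3)}p\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\times\big[\ln\big(f(\mathbf{P})f(\mathbf{P}')\big)-\ln\big(f(\mathbf{p})f(\mathbf{p}')\big)\big]\,[f(\mathbf{P})f(\mathbf{P}')-f(\mathbf{p})f(\mathbf{p}')], \tag{8-47}$$

(check this out). Noting that w is, by definition, a positive quantity, and
that the rest of the integrand in the above expression, being of the form
$\ln\frac{x}{y}\,(x-y)$ (x, y real and positive), is always positive unless x = y (in
which case $\frac{dH}{dt}=0$, see below), one finds, on account of the negative sign
in front of the integral, that H is a decreasing function of time. In other
words, the H-function keeps on decreasing no matter what, till a stage is
reached when

$$f(\mathbf{P})f(\mathbf{P}')=f(\mathbf{p})f(\mathbf{p}'), \tag{8-48}$$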


for all incoming and outgoing momenta at all points ($\mathbf{r}$) in space. This
means that, in a collision, the sum of $\ln f$ for the two incoming particles
is to be the same as the sum evaluated for the two outgoing particles. That
is, $\ln f$ is to be a conserved quantity in a collision or, in other
words, a collisional invariant. The basic conserved quantities in a two-particle
collision are the mass, the three components of momentum, and
the kinetic energy (we assume that the interaction potential is a short-range
one). Hence it should be possible to express $\ln f$ as a linear combination
of these, involving five appropriate parameters, of which one will be an
over-all normalization constant ($\int d^{(3)}r\,d^{(3)}p\,f=N$). One can choose the
remaining parameters in such a way that f is expressed as

$$f(\mathbf{r},\mathbf{p};t)=n\left(\frac{\beta}{2\pi m}\right)^{3/2}\exp\left[-\frac{\beta}{2m}(\mathbf{p}-\mathbf{p}_0)^2\right], \tag{8-49a}$$

where $n\,(=n(\mathbf{r},t))$ stands for the local number density while $\beta$ and $\mathbf{p}_0$ can
also depend on $\mathbf{r}$ and t:

$$\beta=\beta(\mathbf{r},t),\qquad \mathbf{p}_0=\mathbf{p}_0(\mathbf{r},t). \tag{8-49b}$$


Eq. (8-49a), read with (8-49b), defines a state of local equilibrium, which
is not yet a state of true equilibrium because of the dependence of the
parameters on $\mathbf{r},t$. When substituted in the Boltzmann equation this
does not entail $\frac{\partial f}{\partial t}=0$ because of the presence of the streaming term,
though the collision integral (8-34) reduces to zero when the 1-particle
distribution function corresponds to a local equilibrium. This reflects
the difference in time scales of variation of the streaming term and the
collision integral in the Boltzmann equation, where the latter changes
rapidly on a time scale $\tau_c$ (the collision time) while the former varies over
the larger time scale $\tau_r$ (the relaxation time).
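The monotonic decrease of the H-function can be watched at work in a toy setting. The sketch below is not the Boltzmann equation itself but a hypothetical four-velocity, spatially homogeneous model (velocity states +x, −x, +y, −y, all of the same speed) introduced here purely for illustration: the only collision channel is (+x, −x) ↔ (+y, −y), which conserves particle number, momentum, and kinetic energy, and the rate w is taken to be the same in both directions, mimicking the symmetry of the scattering probability. The gain-minus-loss structure is then the same as that of the collision integral, and $H=\sum_i f_i\ln f_i$ should decrease until the analogue of (8-48), $f_{+x}f_{-x}=f_{+y}f_{-y}$, is reached.

```python
import math

def h_function(f):
    """Discrete analogue of the H-function: H = sum_i f_i ln f_i."""
    return sum(x * math.log(x) for x in f)

def step(f, w, dt):
    """One explicit Euler step of the gain-minus-loss rate equations.

    States are ordered (+x, -x, +y, -y). The only collision channel is
    (+x, -x) <-> (+y, -y), with the same rate w in both directions.
    """
    fxp, fxm, fyp, fym = f
    net_gain_x = w * (fyp * fym - fxp * fxm)  # net gain rate of each x-state
    return [fxp + dt * net_gain_x, fxm + dt * net_gain_x,
            fyp - dt * net_gain_x, fym - dt * net_gain_x]

def relax(f0, w=1.0, dt=0.01, n_steps=2000):
    """Integrate the model and record H after every step."""
    f = list(f0)
    history = [h_function(f)]
    for _ in range(n_steps):
        f = step(f, w, dt)
        history.append(h_function(f))
    return f, history

# Assumed initial populations (normalized to 1), far from equilibrium.
f_final, H_history = relax([0.4, 0.3, 0.2, 0.1])
print("final populations:", [round(x, 6) for x in f_final])
print("H initial, H final:", H_history[0], H_history[-1])
```

The differences $f_{+x}-f_{-x}$ and $f_{+y}-f_{-y}$ (the two momentum components) are left untouched by the dynamics, so the final state is fixed by these together with the total population, much as the parameters of a local equilibrium are fixed by the collisional invariants.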


When a system evolves under given physical conditions from a non-equilibrium
configuration, there initially occur relatively fast processes
where collisions play a dominant role and the H-function decreases rapidly 
till the system approaches closely a state of local equilibrium, differing 
from the latter to only a small extent (this difference will be our principal 
concern in sec. 8.3.7). This is the phase where dissipative processes oc- 
cur over small length- and time scales (atomic dimensions and collision 
time). There then begins a process of slow evolution over relatively large 
length- and time scales (mean free path and relaxation time; the separa- 
tion between small and large scales works well for a dilute gas) when the 
collision integral is close to zero, the H-function decreases (i.e., the en- 
tropy increases, see below) relatively slowly, and the streaming term has 
a more dominant role to play. This is referred to as the hydrodynamic 
evolution since it is described well by the basic equations of hydrody- 
namics and is characterized by a set of transport coefficients that depend 
on the interaction potential between the particles making up the system. 
The phase of hydrodynamic evolution ends up with the system reaching
the state of true equilibrium.


1. As we will see shortly (formula (8-57a) below), the H-function, taken
with a negative sign, is proportional to the entropy density of the system
(up to a constant term), and non-equilibrium evolution, which is
always directed to an equilibrium solution (under given physical conditions),
corresponds to entropy production, which is commonly referred
to as dissipation.

2. It is in principle possible that the non-equilibrium state of a system be
described by a local equilibrium solution, for which the system evolves
in time but still the H-function does not decrease, i.e., entropy production
does not take place. More precisely, this corresponds to the
motion of an ideal fluid since such a motion occurs without dissipation.

3. An ideal gas constitutes an example of an ideal fluid. However, a Boltzmann
gas, for which the conditions (8-26) are met with, differs from
an ideal gas in that it admits of non-zero values of a set of transport
coefficients responsible for entropy production in non-equilibrium evolution.

4. A true equilibrium is characterized by $\frac{\partial f}{\partial t}=0$ and $\frac{dH}{dt}=0$. The condition
$\frac{\partial f}{\partial t}=0$ by itself does not necessarily imply an equilibrium state since
it is also satisfied in a non-equilibrium steady state to be discussed
in section 9.10, where the distribution function is time-independent
but entropy production continues to take place because of currents,
or fluxes, set up by thermodynamic forces, or affinities. Similarly, the
condition $\frac{dH}{dt}=0$ does not, by itself, imply an equilibrium state since it
may correspond to the flow of an ideal fluid.


The 1-particle distribution function can be made use of in defining local
time-dependent averages of observables. For instance, if $A(\mathbf{r},\mathbf{p};t)$ be a 1-particle
observable (i.e., one depending on the co-ordinates in the $\mu$-space
of a single particle), then the local average is given by

$$\langle A\rangle_{loc}(\mathbf{r},t)=\frac{1}{n(\mathbf{r},t)}\int d^{(3)}p\,A(\mathbf{r},\mathbf{p};t)\,f(\mathbf{r},\mathbf{p};t), \tag{8-50}$$

where the suffix ‘loc’ indicates that the averaging $\langle\cdot\rangle$ is done on the momentum
distribution alone, and not over the entire $\mu$-space.

In the case of a non-equilibrium state corresponding to a local equilibrium
distribution (eq. (8-49a)), the parameters $\beta(\mathbf{r},t)$, $\mathbf{p}_0(\mathbf{r},t)$ are related
to the local averages of kinetic energy and momentum as

$$\left\langle\frac{(\mathbf{p}-\mathbf{p}_0)^2}{2m}\right\rangle_{loc}=\frac{3}{2}\,[\beta(\mathbf{r},t)]^{-1},\qquad \langle\mathbf{p}\rangle_{loc}=\mathbf{p}_0(\mathbf{r},t), \tag{8-51a}$$

(check these results out). These equalities define a local temperature and
a local drift velocity as

$$T(\mathbf{r},t)=(k_B\,\beta(\mathbf{r},t))^{-1},\qquad \mathbf{u}(\mathbf{r},t)=\frac{1}{m}\,\mathbf{p}_0(\mathbf{r},t). \tag{8-51b}$$
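The two local averages quoted above can be checked numerically by sampling momenta from the local equilibrium form at a fixed point, each Cartesian component being Gaussian with mean $p_{0i}$ and variance $m/\beta$. The following is only a sketch; the values of $\beta$, m, and $\mathbf{p}_0$ are arbitrary assumed numbers (in arbitrary units), and the function names are hypothetical.

```python
import random

def sample_local_equilibrium(beta, m, p0, n_samples, seed=0):
    """Draw momenta from exp[-(beta / 2m) (p - p0)^2] at a fixed point r.

    Each Cartesian component is Gaussian with mean p0_i and variance m / beta.
    """
    rng = random.Random(seed)
    sigma = (m / beta) ** 0.5
    return [tuple(rng.gauss(p0_i, sigma) for p0_i in p0)
            for _ in range(n_samples)]

def local_averages(samples, m, p0):
    """Return (<p>_loc, <(p - p0)^2 / 2m>_loc) estimated from the samples."""
    count = len(samples)
    mean_p = tuple(sum(p[i] for p in samples) / count for i in range(3))
    mean_ke = sum(sum((p[i] - p0[i]) ** 2 for i in range(3))
                  for p in samples) / (2.0 * m * count)
    return mean_p, mean_ke

# Assumed illustrative parameters (arbitrary units).
beta, m, p0 = 0.5, 2.0, (1.0, -0.5, 0.25)
samples = sample_local_equilibrium(beta, m, p0, n_samples=100_000)
mean_p, mean_ke = local_averages(samples, m, p0)
print("mean momentum:", mean_p)
print("mean thermal kinetic energy:", mean_ke, "expected:", 1.5 / beta)
```

Up to sampling noise, the estimated mean momentum reproduces $\mathbf{p}_0$ and the mean thermal kinetic energy reproduces $\frac{3}{2}\beta^{-1}$, i.e., $\frac{3}{2}k_BT$ per particle.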


Restricted to a true equilibrium state characterized by constant values of
$\beta$, $\mathbf{p}_0$ (independent of $\mathbf{r},t$; we assume that the system is not acted upon by
any external field, i.e., the 1-particle Hamiltonian is $H=\frac{p^2}{2m}$), i.e., one for
which

$$f_{eq}(\mathbf{r},\mathbf{p};t)=f_{eq}(\mathbf{p})=\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{3/2}\exp\left[-\frac{\beta}{2m}(\mathbf{p}-\mathbf{p}_0)^2\right], \tag{8-52}$$

the first equality in (8-51b) is a statement of the principle of equipartition
of energy corresponding to the temperature T while the second equality
gives the center of mass velocity $\mathbf{u}$. Introducing the thermal velocity

$$\mathbf{v}=\frac{\mathbf{p}-\mathbf{p}_0}{m}, \tag{8-53}$$

one obtains the velocity distribution

$$P(\mathbf{v})=\left(\frac{m}{2\pi k_BT}\right)^{3/2}\exp\left(-\frac{mv^2}{2k_BT}\right), \tag{8-54}$$

where $P(\mathbf{v})\,\delta\mathbf{v}$ gives the probability that an arbitrarily chosen molecule
will have a thermal velocity within the small range $\mathbf{v}$ to $\mathbf{v}+\delta\mathbf{v}$ (check
this out). This, precisely, is the Maxwell velocity distribution formula,
also referred to as the Maxwell-Boltzmann distribution, though the latter
term is more commonly used to refer to the phase space distribution of a
system in the classical approximation (see sec. 3.3.1).


For a dilute gas whose molecules are subject to an external potential $V(\mathbf{r})$,
the equilibrium one-particle distribution function is given by

$$f_{eq}(\mathbf{r},\mathbf{p};t)=C\left(\frac{\beta}{2\pi m}\right)^{3/2}\exp\left[-\beta\left(\frac{1}{2}mv^2+V(\mathbf{r})\right)\right], \tag{8-55}$$

(check this out), the constant C being fixed by the normalization
$\int d^{(3)}r\,d^{(3)}p\,f_{eq}=N$. This can be obtained as the phase space distribution
function (up to a normalization constant) of a system made up of just
one single particle with Hamiltonian $\frac{p^2}{2m}+V(\mathbf{r})$ in contact with a heat bath
at temperature T. Indeed, choosing any one particle in a dilute gas at
temperature T, all the remaining particles constitute a heat bath for it, at
that temperature.


When the intermolecular interaction is taken into account, the first equality
in (8-51b) gives only the mean kinetic energy, since the expectation
value of the total energy depends on the interaction potential.


The significance of the 1-particle distribution function for a dilute gas
becomes apparent when one evaluates the H-function for the equilibrium
configuration described by (8-52). One obtains

$$H_{eq}=N\left\{\ln\left[\frac{N}{V}\left(\frac{1}{2\pi mk_BT}\right)^{3/2}\right]-\frac{3}{2}\right\} \tag{8-56}$$

(check this out). On comparing with (3-36d), one obtains the important
result relating $H_{eq}$ with the entropy,

$$-k_BH_{eq}=S+\text{constant}, \tag{8-57a}$$

where the constant remains undetermined within the confines of a fully
classical theory (recall that the constant term in the formula (3-36d) is
obtained only by referring to the partitioning of the phase space into irreducible
quantum cells). We now define the non-equilibrium entropy for a
Boltzmann gas (up to an additive constant) in terms of the H-function for
a general non-equilibrium state (eq. (8-43)) as

$$S=-k_BH. \tag{8-57b}$$

In other words, the local entropy density can be defined as

$$s(\mathbf{r},t)=-k_B\int d^{(3)}p\,f(\mathbf{r},\mathbf{p};t)\ln f(\mathbf{r},\mathbf{p};t). \tag{8-58}$$
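The equilibrium value of the H-function quoted above can be checked in a couple of lines. The sketch below assumes the field-free Maxwellian $f_{eq}=\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{3/2}e^{-\frac{\beta}{2m}(\mathbf{p}-\mathbf{p}_0)^2}$ with $\beta=1/k_BT$, and uses the equipartition result $\left\langle\frac{(\mathbf{p}-\mathbf{p}_0)^2}{2m}\right\rangle=\frac{3}{2\beta}$.

```latex
\ln f_{eq} = \ln\!\left[\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{3/2}\right]
             - \frac{\beta}{2m}(\mathbf{p}-\mathbf{p}_0)^2 ,
\\[4pt]
H_{eq} = \int d^{(3)}r\, d^{(3)}p\; f_{eq}\ln f_{eq}
       = N\ln\!\left[\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{3/2}\right]
         - N\beta\left\langle\frac{(\mathbf{p}-\mathbf{p}_0)^2}{2m}\right\rangle
       = N\left\{\ln\!\left[\frac{N}{V}\left(\frac{1}{2\pi m k_B T}\right)^{3/2}\right]-\frac{3}{2}\right\},
\\[4pt]
-k_B H_{eq} = N k_B\left\{\ln\!\left[\frac{V}{N}\,(2\pi m k_B T)^{3/2}\right]+\frac{3}{2}\right\}.
```

The last line has the same dependence on V/N and T as the entropy of a classical ideal gas, the two differing only by a term that does not depend on the thermodynamic state apart from N, which is the content of the relation between $-k_BH_{eq}$ and S.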


The one-particle distribution function $f_{eq}$ of (8-52) reproduces all the thermodynamic
properties of a dilute gas in an equilibrium state specified by
N, V, T, of which the entropy and the mean energy are specific instances
(the same goes for a dilute gas in an external field where $f_{eq}$ is given by
eq. (8-55)). However, all these thermodynamic functions and relations
correspond to those of an ideal gas. Indeed, the equilibrium 1-particle
distribution function bears no trace of the inter-particle potential. On the
other hand, the evolution of non-equilibrium states close to equilibrium
in the Boltzmann-Grad approximation does involve the interaction potential
(since one needs the collision integral to describe the non-equilibrium
evolution) and one does obtain reasonably good results for the transport
coefficients of the gas (sec. 8.3.7) in terms of the interaction potential.


8.3.6 Hydrodynamic evolution: balance equations 


As mentioned earlier, the non-equilibrium evolution in a dilute gas proceeds
through two distinct time scales. In a short time interval of the
order of the collision time $\tau_c$, processes dominated by short range intermolecular
interactions (referred to as ‘collisions’) take place and the system
approaches a state of local equilibrium where small volume elements
chosen arbitrarily within the system remain close to their respective equilibrium
states parametrized by locally defined thermodynamic variables.
Over a longer time scale of the order of the relaxation time, processes of 
transport take place, wherein molecular concentration, momentum, and 
energy are transferred from one collision to the next, and the local ther- 
modynamic variables evolve slowly. This is the regime of hydrodynamic 
evolution where, however, collisions continue to play a role, especially in 
respect of entropy production, or dissipation. In a collision, the momenta 
of the two participating molecules immediately after the collision are cor- 
related with the momenta immediately before it, but the molecules fly 
apart to a large distance from each other before they participate again 
in two independent collision events, each with a new partner this time, 
and in each of these collisions the participating molecules can again be 
assumed to be uncorrelated to each other. Thus, each small volume element
alternately evolves in and out of local equilibrium to small extents,
with complementary roles being played by transport and collision events.

In the hydrodynamic regime, the local averages of the basic collisional
invariants (refer to sec. 8.3.5, paragraph following eq. (8-48)) undergo 
a slow change, where a local average is defined as in (8-50). As men- 
tioned above, the one-particle distribution function f in the hydrody- 
namic regime remains close to a local equilibrium such as the one defined 
by eq. (8-49a). I will outline the derivation of an approximate expression 
for f in sec. 8.3.7. For now, we will see how the local averages for the 
basic collisional invariants change in time and space, while leaving the 
one-particle distribution function unspecified. These will be described in 
terms of a set of partial differential equations, referred to as the equations
of continuity.


Referring to a collisional invariant $A(\mathbf{r},\mathbf{p};t)$, of which only the argument $\mathbf{p}$
is relevant in the context of a collision event, and to the local average $\langle A\rangle$
defined in (8-50), one can evaluate $\frac{\partial(n\langle A\rangle)}{\partial t}$ by making use of the Boltzmann
equation for $\frac{\partial f}{\partial t}$. It may be noted in this context that the number density
$n(\mathbf{r},t)$ does not depend on $\mathbf{p}$ since it is defined as $\int d^{(3)}p\,f$ (i.e., the number
of molecules per unit volume around the point $\mathbf{r}$ at time t), and is the
local observable corresponding to the number of particles participating in
a collision event (alternatively, one can consider the mass of the participating
particles and the local mass density); hence $n(\mathbf{r},t)$ can be moved in
and out of an integral over the momentum variable $\mathbf{p}$. Note, further, that
none of the basic collisional invariants depends explicitly on the time t.

Based on these observations, one obtains

$$\frac{\partial\big(n\langle A\rangle\big)(\mathbf{r},t)}{\partial t}=\int d^{(3)}p\,A(\mathbf{r},\mathbf{p})\frac{\partial f}{\partial t}=\int d^{(3)}p\,A(\mathbf{r},\mathbf{p})\{H,f\}+\int d^{(3)}p\,A(\mathbf{r},\mathbf{p})\left(\frac{\partial f}{\partial t}\right)_{coll}, \tag{8-59}$$

where we have made use of the Boltzmann equation and the collision
integral (8-34) (check this result out). In the second equality here, H
stands for the one-particle Hamiltonian $\frac{p^2}{2m}+V(\mathbf{r})$ as mentioned earlier.
However, the term involving the collision integral reduces to zero since,
by definition, A represents a collisional invariant.


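The argument here is similar to the one used in the derivation of the H-theorem,
where (8-46), (8-47) were obtained by making use of the symmetry
properties of the scattering probability $w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')$. In a similar manner
one can replace $\ln f$ with A and obtain

$$\int d^{(3)}p\,A(\mathbf{r},\mathbf{p})\left(\frac{\partial f}{\partial t}\right)_{coll}=\int d^{(3)}p\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,A(\mathbf{r},\mathbf{p})\,w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,[f(\mathbf{P})f(\mathbf{P}')-f(\mathbf{p})f(\mathbf{p}')]$$
$$=\frac{1}{4}\int d^{(3)}p\,d^{(3)}p'\,d^{(3)}P\,d^{(3)}P'\,[A(\mathbf{r},\mathbf{p})+A(\mathbf{r},\mathbf{p}')-A(\mathbf{r},\mathbf{P})-A(\mathbf{r},\mathbf{P}')]\times w(\mathbf{p},\mathbf{p}'\to\mathbf{P},\mathbf{P}')\,[f(\mathbf{P})f(\mathbf{P}')-f(\mathbf{p})f(\mathbf{p}')]=0, \tag{8-60}$$

since $A(\mathbf{r},\mathbf{p})+A(\mathbf{r},\mathbf{p}')=A(\mathbf{r},\mathbf{P})+A(\mathbf{r},\mathbf{P}')$, in virtue of A being a collisional
invariant.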
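The defining property $A(\mathbf{p})+A(\mathbf{p}')=A(\mathbf{P})+A(\mathbf{P}')$ can be made concrete with a small numerical sketch. It assumes an elastic collision between two particles of equal mass, modelled in the simplest possible way: the total momentum is kept fixed and the relative momentum is rotated to a randomly chosen direction without change of magnitude. The function names are hypothetical and the incoming momenta are arbitrary assumed numbers.

```python
import math
import random

def collide(p, p_prime, rng):
    """Elastic collision of two equal-mass particles.

    The total momentum is kept, and the relative momentum p - p' is rotated
    to a random direction with its magnitude unchanged.
    """
    total = [a + b for a, b in zip(p, p_prime)]
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, p_prime)))
    # Random unit vector obtained by normalizing three Gaussian numbers.
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            break
    n_hat = [x / norm for x in g]
    P = [0.5 * (t + rel * n) for t, n in zip(total, n_hat)]
    P_prime = [0.5 * (t - rel * n) for t, n in zip(total, n_hat)]
    return P, P_prime

def kinetic_energy(p, m=1.0):
    """Kinetic energy p^2 / 2m of a single particle."""
    return sum(x * x for x in p) / (2.0 * m)

rng = random.Random(1)
p, p_prime = [1.0, -2.0, 0.5], [-0.3, 0.7, 2.0]
P, P_prime = collide(p, p_prime, rng)

# Sums of the invariants before and after the collision.
momentum_before = [a + b for a, b in zip(p, p_prime)]
momentum_after = [a + b for a, b in zip(P, P_prime)]
energy_before = kinetic_energy(p) + kinetic_energy(p_prime)
energy_after = kinetic_energy(P) + kinetic_energy(P_prime)
print(momentum_before, momentum_after)
print(energy_before, energy_after)
```

For the choice A = 1 the equality is trivial (two particles go in and two come out), so that the number of particles, the three momentum components, and the kinetic energy together make up the five basic invariants referred to in the text.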


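On now evaluating the Poisson bracket in (8-59), making use of the definition
of a local average as in (8-50), and following a few steps of algebra,
one obtains (see [137], chapter 2, [67], chapter 5)

$$\frac{\partial\big(n\langle A\rangle\big)}{\partial t}+\frac{\partial}{\partial\mathbf{r}}\cdot\big(n\langle\mathbf{v}A\rangle\big)-n\left\langle\mathbf{v}\cdot\frac{\partial A}{\partial\mathbf{r}}\right\rangle-n\left\langle\mathbf{F}\cdot\frac{\partial A}{\partial\mathbf{p}}\right\rangle-n\left\langle\left(\frac{\partial}{\partial\mathbf{p}}\cdot\mathbf{F}\right)A\right\rangle=0, \tag{8-61}$$

where $\mathbf{v}=\frac{\mathbf{p}}{m}$, and all the local averages, as also the local density n, are
functions of $\mathbf{r},t$. In this formula $\mathbf{F}$ stands for $-\frac{\partial V}{\partial\mathbf{r}}$, the force exerted on a
particle at $\mathbf{r}$ due to the external field acting on the system. In the following
we assume that the force is velocity-independent, in which case the last
term in the left hand side of the above equation drops out.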


Notice that, while the formula (8-59) involves microscopically defined
quantities, the transformed formula (8-61) contains only macroscopically
defined ones (obtained by taking local averages), the latter being
hydrodynamic variables in the proper sense.

One then obtains the equations of continuity (or the balance equations)
by choosing in succession A = 1 (equivalently, A = m), $A=mv_i$ (i = 1, 2, 3),
and $A=\frac{1}{2}m(\mathbf{v}-\mathbf{u})^2$, where $\mathbf{u}$ denotes the local drift velocity defined
in (8-51a), (8-51b). These correspond, respectively, to the basic collisional
invariants, namely, the mass, the momentum components, and
the thermal kinetic energy.
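For the simplest choice, A = 1, the balance equation is a statement of conservation of the number of particles, $\frac{\partial n}{\partial t}+\frac{\partial}{\partial\mathbf{r}}\cdot(n\mathbf{u})=0$. The sketch below illustrates what a conservation law of this form implies, using a one-dimensional version with periodic boundaries and a prescribed (assumed) velocity field. The first-order upwind flux scheme used here is a standard numerical device chosen for the illustration and is not something the text relies on; because the update is written in flux form, what leaves one cell enters its neighbour and the total number of particles stays constant.

```python
import math

def upwind_step(n, u_face, dx, dt):
    """One conservative upwind step for dn/dt + d(n u)/dx = 0 (periodic).

    u_face[i] is the velocity at the face between cell i and cell i + 1.
    """
    cells = len(n)
    flux = []
    for i in range(cells):
        # The density is taken from the cell the flow is coming from.
        donor = n[i] if u_face[i] >= 0.0 else n[(i + 1) % cells]
        flux.append(u_face[i] * donor)
    # flux[i - 1] with i = 0 refers to the last face, closing the periodic loop.
    return [n[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(cells)]

# Assumed illustrative set-up on the unit interval (arbitrary units).
cells = 100
dx = 1.0 / cells
dt = 0.004  # keeps dt * max|u| / dx below 1
x_center = [(i + 0.5) * dx for i in range(cells)]
x_face = [(i + 1.0) * dx for i in range(cells)]
n = [1.0 + 0.3 * math.cos(2.0 * math.pi * x) for x in x_center]
u_face = [1.0 + 0.5 * math.sin(2.0 * math.pi * x) for x in x_face]

mass_initial = sum(n) * dx
for _ in range(500):
    n = upwind_step(n, u_face, dx, dt)
mass_final = sum(n) * dx
print("total number before and after:", mass_initial, mass_final)
```

The density profile itself changes appreciably during the run (it piles up where the velocity field converges), while its integral does not, which is precisely the distinction between a local density and the corresponding conserved total.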


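A. The mass balance equation:

$$\frac{\partial n}{\partial t}+\frac{\partial}{\partial\mathbf{r}}\cdot(n\mathbf{u})=0, \tag{8-62}$$

(or, equivalently,)

$$\frac{\partial\rho}{\partial t}+\frac{\partial}{\partial\mathbf{r}}\cdot(\rho\mathbf{u})=0, \tag{8-63}$$

(refer back to eq. (8-18)) where $\rho(\mathbf{r},t)=m\,n(\mathbf{r},t)$ stands for the mean mass
density at $(\mathbf{r},t)$.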


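B. The momentum balance equation:

$$\rho\left(\frac{\partial}{\partial t}+\mathbf{u}\cdot\frac{\partial}{\partial\mathbf{r}}\right)\mathbf{u}=n\mathbf{F}-\frac{\partial}{\partial\mathbf{r}}\cdot\overleftrightarrow{P}, \tag{8-64a}$$

where $\overleftrightarrow{P}$ stands for the pressure tensor (related to the stress tensor; the
term ‘pressure tensor’ is used more commonly in the case of a fluid), a
symmetric one, with elements $P_{ij}$ defined as

$$P_{ij}=\rho\,\langle(v_i-u_i)(v_j-u_j)\rangle\qquad (i,j=1,2,3), \tag{8-64b}$$

and $\frac{\partial}{\partial\mathbf{r}}\cdot\overleftrightarrow{P}$ represents the vector with components

$$\left(\frac{\partial}{\partial\mathbf{r}}\cdot\overleftrightarrow{P}\right)_j=\sum_{i=1}^{3}\frac{\partial P_{ij}}{\partial x_i}\qquad (j=1,2,3), \tag{8-64c}$$

$x_i$ (i = 1, 2, 3) being the Cartesian components of the position vector $\mathbf{r}$.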


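C. The energy balance equation:

$$\frac{\partial e}{\partial t}+\frac{\partial}{\partial\mathbf{r}}\cdot\left(\frac{1}{2}\rho\,\langle\mathbf{v}(\mathbf{v}-\mathbf{u})^2\rangle\right)-\frac{1}{2}\rho\left\langle\mathbf{v}\cdot\frac{\partial}{\partial\mathbf{r}}(\mathbf{v}-\mathbf{u})^2\right\rangle=0, \tag{8-65a}$$

where

$$e=\frac{1}{2}\rho\,\langle(\mathbf{v}-\mathbf{u})^2\rangle, \tag{8-65b}$$

stands for the energy density, and a term $n\left\langle\mathbf{F}\cdot\frac{\partial A}{\partial\mathbf{p}}\right\rangle=n\mathbf{F}\cdot\langle\mathbf{v}-\mathbf{u}\rangle$ has been dropped by
making use of the fact that the force is velocity-independent and $\langle\mathbf{v}-\mathbf{u}\rangle=0$.

This equation can be written in a more useful form by making use of the
first equality in (8-51a), and defining the heat current density

$$\mathbf{q}=n\left\langle(\mathbf{v}-\mathbf{u})\,\frac{1}{2}m(\mathbf{v}-\mathbf{u})^2\right\rangle, \tag{8-66a}$$

and the rate of strain tensor (again a symmetric one) $\overleftrightarrow{\Gamma}$ with Cartesian
components

$$\Gamma_{ij}=\frac{1}{2}\left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right)\qquad (i,j=1,2,3). \tag{8-66b}$$

The desired form of the energy balance equation can then be seen to be

$$\left(\frac{\partial}{\partial t}+\mathbf{u}\cdot\frac{\partial}{\partial\mathbf{r}}\right)T+\frac{2m}{3\rho k_B}\,\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{q}=-\frac{2m}{3\rho k_B}\sum_{i,j}\Gamma_{ij}P_{ij}, \tag{8-67}$$

where the mass- and momentum balance equations have been made use
of (check this out; see [67], chapter 5).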
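For readers who wish to fill in the steps, the passage from the balance of thermal kinetic energy to an equation for the local temperature can be sketched as follows. The normalizations used here are assumptions of the sketch: the energy density is taken as $e=\frac{1}{2}\rho\langle(\mathbf{v}-\mathbf{u})^2\rangle$, the heat current as $\mathbf{q}=n\langle(\mathbf{v}-\mathbf{u})\frac{1}{2}m(\mathbf{v}-\mathbf{u})^2\rangle$, the pressure tensor as $P_{ij}=\rho\langle(v_i-u_i)(v_j-u_j)\rangle$, and the rate of strain tensor as $\Gamma_{ij}=\frac{1}{2}(\partial u_i/\partial x_j+\partial u_j/\partial x_i)$, with $\rho=mn$.

```latex
% Step 1: split v = u + (v - u) inside the averages; <v - u> = 0 removes the cross terms
\frac{\partial e}{\partial t}
  + \frac{\partial}{\partial\mathbf{r}}\cdot\left(e\,\mathbf{u}+\mathbf{q}\right)
  + \sum_{i,j} P_{ij}\,\Gamma_{ij} = 0 ,
\\[4pt]
% Step 2: equipartition for the thermal motion
e = \frac{3}{2}\, n k_B T ,
\\[4pt]
% Step 3: use the mass balance  \partial_t n + \partial_r . (n u) = 0
\frac{\partial (nT)}{\partial t}
  + \frac{\partial}{\partial\mathbf{r}}\cdot\left(nT\,\mathbf{u}\right)
  = n\left(\frac{\partial}{\partial t}+\mathbf{u}\cdot\frac{\partial}{\partial\mathbf{r}}\right) T ,
\\[4pt]
% Step 4: divide by (3/2) n k_B, with n = \rho / m
\left(\frac{\partial}{\partial t}+\mathbf{u}\cdot\frac{\partial}{\partial\mathbf{r}}\right) T
  + \frac{2m}{3\rho k_B}\,\frac{\partial}{\partial\mathbf{r}}\cdot\mathbf{q}
  = -\,\frac{2m}{3\rho k_B}\sum_{i,j}\Gamma_{ij}P_{ij} .
```

In the first step the symmetry of $P_{ij}$ is what allows the velocity gradient $\partial u_i/\partial x_j$ to be replaced with its symmetric part $\Gamma_{ij}$.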


1. The operator $\left(\frac{\partial}{\partial t}+\mathbf{u}\cdot\frac{\partial}{\partial\mathbf{r}}\right)$ is referred to as the ‘convective derivative’ and
signifies the temporal rate of change in a frame of reference sharing
the mean motion of the particles in a small volume around any given
point $\mathbf{r}$.


2. The balance equations stated above correspond to conservation prin- 
ciples and are also referred to as equations of continuity. While their 
derivations have been outlined in the context of the Boltzmann equation,
they are of more general validity and express the fact that the
amount of mass, or of any of the Cartesian components of the total
momentum, or of the relative kinetic energy (in the local center of mass
frame) in a small volume changes at the same rate at which it flows
into this volume across the surface bounding it, added to the rate, if
any, at which the amount in question is produced locally, which is the
case of the momentum density produced by the action of an external
field of force (refer back to sec. 8.2.3.2 where the general form of
the balance equations in the case of a simple fluid was stated). For our present
purpose, all these amounts are to be interpreted in the sense of local 
averages as defined in (8-50), and depend on the one-particle distri- 
bution function f(r,p;t), where the latter is required to vary slowly in 
space and time (hydrodynamic evolution). Recall that we are inter- 
ested in a regime where the system is close to a configuration of local 
equilibrium, i.e., the one-particle distribution function differs to only 
a small extent from an expression of the form (8-49a), and where the 
local densities of mass, momentum, and relative kinetic energy vary 
slowly by means of transport, which owes its origin to the streaming 
term in the Boltzmann equation. The process of transport is, however, 
accompanied with local dissipation since the one-particle distribution
function differs slightly from the local equilibrium form, as a result of
which the collision integral has a small non-zero value.


3. The pressure tensor $\overleftrightarrow{P}$, with Cartesian components $P_{ij}$ (i, j = 1, 2, 3)
gives the rate at which the ith component of momentum is transported
in the jth direction. In the equilibrium configuration in a fluid, this is
completely specified in terms of the pressure as $p\,\delta_{ij}$, where the pressure
is related to the temperature T by the equation of state $p=nk_BT$.
More generally, one can decompose the pressure tensor as

$$P_{ij}=p_{ij}+(p+p')\,\delta_{ij}, \tag{8-68}$$

where $p_{ij}$ (referred to as the shear tensor) represents the symmetric
part of $\overleftrightarrow{P}$ with zero diagonal elements, $p=\frac{1}{3}\sum_i P_{ii}$ stands for the
hydrostatic pressure, and $p'$ is referred to as the ‘pressure deviator’
which, however, is not of relevance for a dilute gas.


4. The balance equations (8-63), (8-64a), (8-67) stated above do not form 
a closed system. The equation for density depends on the velocity field 
u(r,t), the equation for the momentum density (or, equivalently, of the 
velocity field) depends on the pressure tensor, and the equation for 
the relative kinetic energy density (which is related to the trace of the 
pressure tensor) depends on the heat current. These will be converted 
to a closed set by bringing in the transport coefficients, the latter being
related to the kinetic coefficients that tell us how the macroscopic
fluxes depend on affinities.


5. Analogous to the balance equations stated above, one can also speak
of the entropy balance equation, where there is, in general, a non-zero
entropy production term. The latter, however, does not owe its origin to
an external force field but arises due to local dissipative processes. The
strength of these processes is described by the transport coefficients.
The latter are incorporated in hydrodynamics as phenomenological
constants, but are derived from more fundamental considerations in
kinetic theory and statistical mechanics. We will outline the derivation
of the transport coefficients in sec. 8.3.7 below in the context of the
Boltzmann equation. More general derivations will be outlined later in
the book (sec. 8.4.11.3).


The derivation of the entropy balance equation is accomplished by
putting $A = \ln f$ in (8-59) (note that, in contrast to the conserved quan-
tities, $\ln f$ is not an exact collisional invariant, but is only approxi-
mately so in the present context), which yields, on making use of the
symmetry properties of the scattering probability w,

$$\frac{\partial h}{\partial t} + \nabla\cdot\mathbf{j}_h = \sigma_h, \tag{8-69a}$$




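where the H-function density h, the H-function current density $\mathbf{j}_h$, and
the H-function source density $\sigma_h$ are defined as

$$h(\mathbf{r},t) = \int d^3p\, f\ln f, \qquad \mathbf{j}_h = \frac{1}{m}\int d^3p\,\mathbf{p}\, f\ln f,$$

$$\sigma_h = \frac{1}{4}\int d^3p\, d^3p'\, d^3\tilde p\, d^3\tilde p'\; w(\mathbf{p},\mathbf{p}' \to \tilde{\mathbf{p}},\tilde{\mathbf{p}}')\,\left(\tilde f\tilde f' - ff'\right)\ln\frac{ff'}{\tilde f\tilde f'}. \tag{8-69b}$$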


One can rewrite eq. (8-69a) in terms of the local entropy density ($s =
-k_Bh$), entropy current density ($\mathbf{j}_s = -k_B\mathbf{j}_h$), and entropy source density
(or entropy production; $\sigma_s = -k_B\sigma_h$; this was denoted by $\sigma$ in sec. 8.2.2)
as

$$\frac{\partial s}{\partial t} + \nabla\cdot\mathbf{j}_s = \sigma_s, \tag{8-69c}$$

where, by the third equality in (8-69b), one has

$$\sigma_s > 0 \tag{8-69d}$$

in a non-equilibrium configuration.
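
The sign stated in (8-69d) can be checked in one line. The following is a brief sketch, assuming the symmetrized form of the source density in which the integrand is the non-negative scattering probability w multiplied by $(\tilde f\tilde f' - ff')\ln(ff'/\tilde f\tilde f')$; the shorthand x, y is introduced here only for illustration.

```latex
% x = f f'  (product of distribution functions before the collision)
% y = \tilde f \tilde f'  (product after the collision); both are positive
\[
  (y - x)\,\ln\frac{x}{y} \;\le\; 0 \qquad \text{for all } x, y > 0,
\]
% because (y - x) and \ln(x/y) always carry opposite signs;
% equality holds only for x = y, i.e. in local equilibrium.
\[
  w \ge 0 \;\Longrightarrow\; \sigma_h \le 0,
  \qquad
  \sigma_s = -k_B\,\sigma_h \ge 0 .
\]
```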


It may be remarked that the local densities n (or, equivalently, $\rho$), u,
and e are all mechanical variables while s is a thermodynamic one.
The entropy balance equation (8-69c) remains only a formal one un- 
less the thermodynamic relevance of s is specified. As we have seen, 
the equilibrium one-particle distribution function f encodes the en- 
tire thermodynamics of a dilute gas and one can, by analogy, assume 
that the non-equilibrium distribution function f describes the non- 
equilibrium thermodynamics of the gas. In particular, in the hydrody- 
namic regime, the non-equilibrium evolution is entirely described by 
the balance equations of the local density, the local momentum den- 
sity, the local energy density, and the local entropy density. Of these 


the balance equation for the entropy density can be assumed to be of
general validity even beyond the limits of the hydrodynamic regime.


6. As mentioned above, the balance equations stated here are special 
instances (in the context of the Boltzmann equation) of the ones enu- 


merated in sec. 8.2.3.2 for a simple fluid. 


8.3.7 Hydrodynamic evolution: transport coefficients 
8.3.7.1 The ideal fluid approximation 


In converting the balance equations into a closed set we start from an 
approximation where all the local averages are calculated with the local 
equilibrium form (8-49a) of the one-particle distribution function. This 
form, however, does not actually satisfy the Boltzmann equation because 
of the streaming term, in consequence of which one obtains only an ideal- 
ized version of hydrodynamics, namely, that describing the motion of an 
ideal fluid, where dissipative effects are absent and all the transport coef- 
ficients are zero. We correct for this shortcoming of the local equilibrium 
assumption by adding a correction term to (8-49a) in sec. 8.3.7.2 such 
that the Boltzmann equation is satisfied to a better degree of approxi-
mation. A systematic way to do this is to follow the Chapman-Enskog
scheme, but we will first adopt the simpler and more intuitive approach of
the relaxation time approximation.


Deferring this to the next section (sec. 8.3.7.2), we confine ourselves for
now to the assumption of local equilibrium, in which case the pressure
tensor $P_{ij}$ and the heat current q work out to

$$P_{ij} = p\,\delta_{ij}, \qquad p = nk_BT, \qquad \mathbf{q} = 0, \tag{8-70}$$

where the number density (n; the mass density is given by $\rho = mn$), pres-
sure (p), and the temperature (T) are locally defined variables.


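In the local equilibrium approximation (this may be viewed as the ‘zero or-
der’ approximation in kinetic theory) the balance equations stated above
(sec. 8.3.6) assume the following forms.

[ideal fluid: mass balance]

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0, \tag{8-71a}$$

[ideal fluid: momentum balance]

$$\rho\left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)\mathbf{u} = \frac{\rho}{m}\mathbf{F} - \nabla p, \tag{8-71b}$$

[ideal fluid: energy balance]

$$\rho\left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)T + \frac{2m}{3k_B}\,p\,\nabla\cdot\mathbf{u} = 0. \tag{8-71c}$$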


Note that the mass balance equation remains unchanged when compared 
with eq. (8-63), while the other two balance equations get simplified in 
virtue of (8-70). 


In the context of this zero order approximation, there is no room for heat
flow and viscous resistance, implying that dissipation is absent. This is
equivalent to the statement that, with the local equilibrium form of the
one-particle distribution function, $ff' - \tilde f\tilde f' = 0$, and hence the entropy
production $\sigma_s = 0$ (refer to the third equality in (8-69b) and the definition
$\sigma_s = -k_B\sigma_h$).

The ideal fluid equations are useful in various important contexts. For in-
stance, these give the equation for sound propagation without absorption
and a good approximation to the velocity of sound in a fluid.
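
As an illustration of the last remark, here is a short sketch of how a sound speed follows from the ideal fluid equations. It assumes no external force and small perturbations $\rho = \rho_0 + \rho_1$, $p = p_0 + p_1$, $\mathbf{u} = \mathbf{u}_1$ about a uniform state at rest; the subscripted symbols and $c_s$ are introduced here for illustration only.

```latex
% Linearized mass and momentum balance about a uniform state at rest
\begin{align*}
  \frac{\partial\rho_1}{\partial t} + \rho_0\,\nabla\cdot\mathbf{u}_1 &= 0, \\
  \rho_0\,\frac{\partial\mathbf{u}_1}{\partial t} &= -\nabla p_1 .
\end{align*}
% The energy balance, together with p = n k_B T, gives T \propto n^{2/3}
% along the flow, i.e. p \propto \rho^{5/3} for a monatomic gas, so that
\[
  p_1 = \frac{5}{3}\,\frac{p_0}{\rho_0}\,\rho_1 .
\]
% Eliminating u_1 and p_1 yields a wave equation without any damping term:
\[
  \frac{\partial^2\rho_1}{\partial t^2} = c_s^2\,\nabla^2\rho_1,
  \qquad
  c_s^2 = \frac{5}{3}\,\frac{p_0}{\rho_0} = \frac{5k_BT_0}{3m} .
\]
```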


8.3.7.2 Transport coefficients: the relaxation time approximation 


We begin by recalling a few results from the elementary theory of trans- 
port in an ideal gas (see [101], chapter 16) where the zero order approx- 
imation for the one-particle distribution function is used while, at the 
same time, dissipative transport is introduced in an intuitive manner by 
taking into account the momentum and energy transfer between volume 
elements by means of molecules in their free flight between successive 
collisions. The length of the free path is assumed to be $\lambda$, which is worked
out separately by calculating the mean separation between successive
collisions in a hard sphere gas in which the Maxwellian velocity distribu-
tion is assumed to apply. One obtains, in this elementary kinetic theory
approach, the following results for the thermal conductivity ($\kappa$) and shear
viscosity ($\eta$) of a dilute monatomic gas:

$$\kappa = \frac{1}{2}\,nk_B\lambda\bar c = \frac{k_B^{3/2}T^{1/2}}{\pi^{3/2}m^{1/2}\sigma^2}, \tag{8-72a}$$

$$\eta = \frac{1}{3}\,mn\lambda\bar c = \frac{2m^{1/2}k_B^{1/2}T^{1/2}}{3\pi^{3/2}\sigma^2}, \tag{8-72b}$$


where $\sigma$ stands for the collision diameter (i.e., the diameter of a molecule,
assumed to be a hard sphere), $\lambda = \frac{1}{\sqrt{2}\,\pi n\sigma^2}$ is the mean free path, and
$\bar c = \sqrt{\frac{8k_BT}{\pi m}}$ is the mean speed of the molecules. These results imply the
important fact that the thermal conductivity and the shear viscosity are
both independent of the pressure at any given temperature, which is in
accord with experimental observations. The temperature dependence for
both the coefficients is seen to be $T^{1/2}$, while observed data reveal a power
law dependence somewhat different from $T^{1/2}$. As for the pre-factors in the
expressions for $\kappa$ and $\eta$, these are not in agreement with observations. In
particular, the ratio of the two constants works out to

$$\frac{\kappa}{\eta} = \frac{3k_B}{2m}, \tag{8-72c}$$

which differs from the experimentally observed value

$$\frac{\kappa}{\eta} \approx \frac{15k_B}{4m}. \tag{8-73}$$


Elementary kinetic theory also provides one with an estimate of the dif- 
fusion coefficient of a dilute gas. Diffusion is the phenomenon wherein 
inhomogeneities in the macroscopic number density n(r,t) get evened out 
in a gas maintained in a closed volume and is described by the diffu- 
sion equation in which a constant D, the diffusion coefficient, appears. 
However, the microscopic theory leading to an estimate of the diffusion 
constant from fundamental considerations needs special considerations, 
because one needs to distinguish the diffusing molecules from the back- 
ground through which diffusion occurs. In other words, in the case of 
a one-component gas, one needs to consider the random motion of a 
marked, or tagged, molecule through the background provided by the 
other molecules of the gas. Alternatively, one considers a two-component 


gas in which the diffusion of one component against the other can be
examined. The diffusion coefficient obtained in the former situation is
referred to as the coefficient of self-diffusion. The self-diffusion coeffi-
cient can also be obtained by first considering a two-component gas and
then identifying the two components (see sec. 8.3.7.3). In the elementary
approach mentioned above, one obtains the following expression for this
coefficient:

$$D = \frac{1}{3}\,\lambda\bar c = \frac{2}{3\pi^{3/2}n\sigma^2}\sqrt{\frac{k_BT}{m}}, \tag{8-74}$$

where one observes that the diffusion coefficient is proportional to $\sqrt{T}$
and inversely proportional to the particle density.
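
The scaling statements made above for the elementary kinetic theory can be checked numerically. The following is a minimal sketch, assuming a hard sphere gas with mean free path $1/(\sqrt{2}\pi n\sigma^2)$ and Maxwellian mean speed $\sqrt{8k_BT/(\pi m)}$, and taking the self-diffusion estimate in the standard mean-free-path form $D = \lambda\bar c/3$ (an assumption of this sketch). The function names and the argon-like input values are illustrative and are not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mean_free_path(n, sigma):
    """Hard-sphere mean free path, lambda = 1 / (sqrt(2) * pi * n * sigma**2)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * n * sigma**2)


def mean_speed(T, m):
    """Maxwellian mean speed, c = sqrt(8 * k_B * T / (pi * m))."""
    return math.sqrt(8.0 * K_B * T / (math.pi * m))


def kappa_elementary(n, T, m, sigma):
    """Thermal conductivity, kappa = (1/2) n k_B lambda c."""
    return 0.5 * n * K_B * mean_free_path(n, sigma) * mean_speed(T, m)


def eta_elementary(n, T, m, sigma):
    """Shear viscosity, eta = (1/3) m n lambda c."""
    return m * n * mean_free_path(n, sigma) * mean_speed(T, m) / 3.0


def diffusion_elementary(n, T, m, sigma):
    """Self-diffusion estimate, D = (1/3) lambda c (assumed standard form)."""
    return mean_free_path(n, sigma) * mean_speed(T, m) / 3.0


# Illustrative, assumed inputs (roughly argon-like; not taken from the text).
m = 6.63e-26     # molecular mass, kg
sigma = 3.6e-10  # hard-sphere diameter, m
T = 300.0        # temperature, K
n = 2.5e25       # number density, m^-3

ratio = kappa_elementary(n, T, m, sigma) / eta_elementary(n, T, m, sigma)
print("kappa/eta in units of k_B/m:", ratio * m / K_B)
print("eta at n and at 2n:", eta_elementary(n, T, m, sigma),
      eta_elementary(2.0 * n, T, m, sigma))
```

Doubling the density at fixed temperature leaves the two printed viscosities equal up to rounding, which is the pressure independence noted above, while the diffusion estimate halves.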


The elementary theory alluded to above has several shortcomings
(see [20], chapter 6) as far as agreement with observations goes. For
instance, the temperature dependence of the transport coefficients, the
$\kappa/\eta$ ratio, and the pre-factors in the expressions obtained for the trans-
port coefficients come out wrong in this approach. The theory as-
sumes all free paths to be the same, and relies implicitly on a hard
sphere model. As such, it does not take into account such factors as
the fluctuations of the free path and the details of the inter-molecular
interactions. Importantly, the elementary theory does not explain
cross-effects between thermal and diffusional processes observed ex-
perimentally. All these shortcomings are remedied in the Chapman-
Enskog theory to be outlined in sec. 8.3.7.3.


A more sound approach to the calculation of transport coefficients is to
start from the zero order approximation for the one-particle distribution
function (eq. (8-49a); we call this $f_0$), which fails to satisfy the Boltzmann
equation because of the streaming term, and then to add to it a small
correction (say, $\delta f$) so that the resulting function

$$f = f_0 + \delta f \tag{8-75}$$


satisfies the Boltzmann equation to a better degree of approximation. On 
evaluating the local thermodynamic variables with this improved one- 
particle density, and expressing the balance equations in the form of lin- 
ear relations between thermodynamic fluxes and forces, one can identify 


the transport coefficients of interest. 


We will first evaluate the correction $\delta f$ in a simple approximation scheme,
referred to as the relaxation time approximation (also referred to as the
Bhatnagar-Gross-Krook, or the BGK approximation; see [137], [67] for
details), which makes use of the fact that the rate of change of the one-
particle distribution function as represented by the collision integral is
entirely due to the correction $\delta f$ (recall that the collision integral vanishes
for $f_0$), and that, starting with f at a given instant of time in the course
of time evolution, one ends up with the local equilibrium form $f_0$ in an
interval of the order of $\tau_r$, the relaxation time (which we assume to be a
constant, approximated by the mean time between collisions). In other
words, one has

$$\left(\frac{\partial f}{\partial t}\right)_{\rm coll} = -\frac{\delta f}{\tau_r}. \tag{8-76}$$
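
The content of the relaxation time assumption can be made concrete with a small numerical sketch. It assumes a spatially uniform situation with a fixed local equilibrium form, so that only the collision term acts and the deviation obeys $d(\delta f)/dt = -\delta f/\tau_r$. The function name, the step count, and the value of the relaxation time are illustrative assumptions.

```python
import math


def relax(delta_f0, tau_r, t_end, steps):
    """Integrate d(delta_f)/dt = -delta_f / tau_r with explicit Euler steps."""
    dt = t_end / steps
    delta_f = delta_f0
    for _ in range(steps):
        delta_f -= dt * delta_f / tau_r
    return delta_f


tau_r = 1.0e-10  # assumed relaxation time, s (illustrative value)

# After three relaxation times the deviation has decayed to about exp(-3)
# of its initial value, i.e. f has essentially reached the form f_0.
numeric = relax(1.0, tau_r, 3.0 * tau_r, 30000)
exact = math.exp(-3.0)
print(numeric, exact)
```

The deviation from local equilibrium thus dies out exponentially on the time scale $\tau_r$, which is all that the approximation retains of the detailed collision dynamics.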
With this simple and rather crude approximation, we now evaluate $\delta f$ by
requiring that the corrected one-particle distribution function satisfy the
Boltzmann equation up to terms of the first order of smallness. A few
steps of algebra (which I skip) then give (see [137])

$$\delta f = -\tau_r f_0\left[\frac{1}{T}(\mathbf{v}-\mathbf{u})\cdot\nabla T\left(\frac{m}{2k_BT}(\mathbf{v}-\mathbf{u})^2 - \frac{5}{2}\right) + \frac{m}{k_BT}\,U_{ij}\left((v_i-u_i)(v_j-u_j) - \frac{1}{3}(\mathbf{v}-\mathbf{u})^2\delta_{ij}\right)\right], \tag{8-77}$$


where, as before, u denotes the local average of v so that v − u stands
for the local thermal velocity, and $U_{ij}$ ($i,j = 1,2,3$) stands for the rate of
strain tensor defined earlier (eq. (8-66b)).


With this correction applied to f one can now evaluate the local av- 
erages of the mass density, momentum density, and the kinetic energy 
density, as also the heat current density and the momentum current 


density. One obtains, in particular, 


$$\mathbf{q} = -\kappa\nabla T, \tag{8-78a}$$

where $\kappa$, the thermal conductivity, is given by

$$\kappa = \frac{5}{2}\,\frac{n\tau_rk_B^2T}{m}, \tag{8-78b}$$

and

$$p_{ij} = -2\eta\left(U_{ij} - \frac{1}{3}\,\delta_{ij}\nabla\cdot\mathbf{u}\right), \tag{8-78c}$$

where $p_{ij}$ is the shear tensor of (8-68), and the constant $\eta$ is seen to work
out to

$$\eta = n\tau_rk_BT. \tag{8-78d}$$


On comparing with Newton’s formula for viscosity, one finds that $\eta$ stands
for the shear viscosity of the gas (check this out). The values of the trans-
port coefficients obtained here can be compared with those obtained in
the elementary kinetic theory derivation (eq. (8-72a), (8-72b)) by taking,
as an order-of-magnitude estimate, $\tau_r = \lambda/\bar c$, when one obtains

$$\kappa = \frac{5k_B^{3/2}T^{1/2}}{8\pi^{1/2}m^{1/2}\sigma^2}, \qquad \eta = \frac{m^{1/2}k_B^{1/2}T^{1/2}}{4\pi^{1/2}\sigma^2}, \tag{8-79a}$$

assuming a hard sphere model. One again observes that the two trans-
port coefficients are independent of the pressure at any given tempera-
ture, while the temperature-dependence is again $\sim T^{1/2}$ for both, analogous
to the elementary kinetic theory results. However, the pre-factors differ
from the ones appearing in the expressions (8-72a), (8-72b), and the ratio
of the two works out to

$$\frac{\kappa}{\eta} = \frac{5k_B}{2m}, \tag{8-79b}$$


which still differs from the correct, Chapman-Enskog value of $\frac{15k_B}{4m}$. The
ratio $\kappa/\eta$ is closely related to the Prandtl number for a fluid,

$$\mathrm{Pr} = \frac{c_p\eta}{\kappa}, \tag{8-80}$$

which is a characteristic dimensionless number of considerable relevance
in the comparison of flow characteristics of fluids of diverse descriptions.
In the above expression, $c_p$ is the specific heat at constant pressure
per unit mass, i.e., $c_p = \frac{C_p}{N_Am}$ ($C_p$ = molar specific heat at constant
pressure, $N_A$ = Avogadro number), whose value for an ideal monatomic gas
is $\frac{5k_B}{2m}$. With the ratio as given
by (8-79b), the Prandtl number works out to Pr = 1, while the Chapman-
Enskog value for the ratio $\kappa/\eta$ yields $\mathrm{Pr} = \frac{2}{3}$ for monatomic gases with
spherically symmetric interaction potentials, which is in much better ac-
cord with observed values.
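
The two Prandtl numbers quoted above follow from simple arithmetic, which the sketch below reproduces. It uses the relaxation time expressions $\kappa = \frac{5}{2}n\tau_rk_B^2T/m$ and $\eta = n\tau_rk_BT$ together with $c_p = 5k_B/(2m)$; the function names and numerical inputs are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def kappa_relaxation(n, tau_r, T, m):
    """Thermal conductivity in the relaxation time approximation."""
    return 2.5 * n * tau_r * K_B**2 * T / m


def eta_relaxation(n, tau_r, T):
    """Shear viscosity in the relaxation time approximation."""
    return n * tau_r * K_B * T


def prandtl(c_p, eta, kappa):
    """Prandtl number Pr = c_p * eta / kappa (c_p per unit mass)."""
    return c_p * eta / kappa


# Illustrative, assumed inputs (not taken from the text).
m, T, n, tau_r = 6.63e-26, 300.0, 2.5e25, 1.7e-10
c_p = 2.5 * K_B / m  # ideal monatomic gas, per unit mass

pr_relaxation = prandtl(c_p, eta_relaxation(n, tau_r, T),
                        kappa_relaxation(n, tau_r, T, m))

# With the Chapman-Enskog ratio kappa/eta = 15 k_B / (4 m):
pr_chapman_enskog = c_p / (15.0 * K_B / (4.0 * m))

print(pr_relaxation, pr_chapman_enskog)
```

The result is independent of the chosen inputs, since n, $\tau_r$, T and m cancel in the ratio.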


The transport coefficients $\kappa$ and $\eta$ in formulae (8-78a), (8-78c) do not appear,
strictly speaking, as proportionality constants between thermodynamic fluxes
and affinities as introduced in sec. 8.2.2, but are related to the kinetic co-
efficients defined there. In the case of a dilute gas whose non-equilibrium
evolution is described by the Boltzmann equation, there does not occur any
mutual interaction between heat flow and momentum flow, i.e., the two
occur independently of one another. This implies that there are only two
kinetic coefficients $L_1$ and $L_2$, i.e., no mutual coefficient (such as $L_{12}$) is
involved, and $\kappa$, $\eta$ can be expressed in terms of these two.

On correctly identifying the thermodynamic affinities and fluxes in heat flow
and momentum flow in a Boltzmann gas, one obtains the following expres-
sion for the entropy source density $\sigma_s$:

$$\sigma_s = \frac{\kappa}{T^2}(\nabla T)^2 + \frac{2\eta}{T}\sum_{i,j}\left(U_{ij} - \frac{1}{3}\,\delta_{ij}\nabla\cdot\mathbf{u}\right)^2 \tag{8-81}$$

(check this out; see, for instance, [81], chapter 3 and chapter 8).
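
A small numerical sketch may help in reading the entropy source density as a sum of a heat-flow part and a viscous part. It assumes that the rate of strain tensor is the symmetrized velocity gradient, $U_{ij} = \frac{1}{2}(\partial_iu_j + \partial_ju_i)$ (its definition in eq. (8-66b) is not reproduced here), and the function names and input values are illustrative.

```python
def rate_of_strain(grad_u):
    """Symmetric part U_ij = (d_i u_j + d_j u_i) / 2 of a 3x3 velocity gradient."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]


def entropy_source(kappa, eta, T, grad_T, grad_u):
    """sigma_s = kappa |grad T|^2 / T^2 + (2 eta / T) * sum_ij (traceless U_ij)^2."""
    U = rate_of_strain(grad_u)
    div_u = sum(U[i][i] for i in range(3))
    shear_sq = sum((U[i][j] - (div_u / 3.0 if i == j else 0.0)) ** 2
                   for i in range(3) for j in range(3))
    heat_part = kappa * sum(g * g for g in grad_T) / T**2
    return heat_part + 2.0 * eta * shear_sq / T


# Illustrative, assumed values (SI units).
kappa, eta, T = 0.02, 2.0e-5, 300.0

# Uniform dilation: the traceless part of U vanishes, so no viscous entropy
# production arises in a dilute gas (there is no bulk viscosity term).
dilation = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
sigma_dilation = entropy_source(kappa, eta, T, (0.0, 0.0, 0.0), dilation)

# Simple shear: only one non-zero velocity gradient component.
shear = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
sigma_shear = entropy_source(kappa, eta, T, (0.0, 0.0, 0.0), shear)

print(sigma_dilation, sigma_shear)
```

Both contributions are sums of squares, so the entropy source density is non-negative whenever $\kappa$ and $\eta$ are positive.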
8.3.7.3 The Chapman-Enskog theory: a brief introduction


The Chapman-Enskog method is a systematic approach for solving the
Boltzmann equation wherein one obtains a special class of solutions, termed
‘normal solutions’, in a perturbative manner, a normal solution being one
where the one-particle distribution function depends on r, t only through
the local variables n, u, T (in a notation by now familiar). The small pa-
rameter in the expansion is the Knudsen number $\mathrm{Kn} = \lambda/L \ll 1$, which
is small in the case of a gas for which the thermodynamic parameters
(such as the particle density, momentum density and temperature) vary
appreciably only on a length scale L (which is trivially the case if the gas
is confined in a region of linear dimension L) large compared to the mean
free path $\lambda$. This corresponds physically to a situation for which colli-
sions are relevant in determining the state of the gas, while the condition
$\mathrm{Kn} \gg 1$ implies that collisions are rare and irrelevant.
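
To get a feel for the two regimes, the Knudsen number can be estimated for a hard sphere gas. The sketch below assumes the mean free path $\lambda = 1/(\sqrt{2}\pi n\sigma^2)$; the function names, the molecular diameter, and the two densities are illustrative assumptions rather than values from the text.

```python
import math


def mean_free_path(n, sigma):
    """Hard-sphere mean free path, lambda = 1 / (sqrt(2) * pi * n * sigma**2)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * n * sigma**2)


def knudsen_number(n, sigma, L):
    """Kn = lambda / L for number density n, diameter sigma, length scale L."""
    return mean_free_path(n, sigma) / L


sigma = 3.6e-10  # assumed hard-sphere diameter, m
L = 1.0e-2       # assumed length scale of the gradients, m

kn_dense = knudsen_number(2.5e25, sigma, L)     # gas near ambient density
kn_rarefied = knudsen_number(1.0e16, sigma, L)  # highly rarefied gas

print(kn_dense, kn_rarefied)
```

At the higher density the Knudsen number is far below unity, which is the regime of the Chapman-Enskog expansion, while at the lower density the mean free path exceeds L and the expansion is not applicable.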


Thus, we now have three sets of conditions to work with: first, the Boltzmann- 
Grad conditions (8-26) are the ones necessary for the Boltzmann equation 
to hold and to describe the non-equilibrium evolution of the gas. The con- 
dition 7. << 7, defines the hydrodynamic regime in which the state of the 
gas is close to one of local equilibrium, i.e., a small (but macroscopic) vol- 
ume element chosen arbitrarily within the gas can be said to be in a state 
of equilibrium, which changes slowly in space and time as collisional in- 
variants are carried by molecules in their free flight from one region to 
another. And finally, the condition Kn << 1 is necessary to ensure that 
solutions to the Boltzmann equation having a number of desirable fea-
tures can be obtained by series expansion.

The question of analyticity of the solution to the Boltzmann equation
near a local equilibrium form is, from the mathematical point of view,
a subtle one, and the Chapman-Enskog method is designed to obtain
a series expansion for a solution belonging to a certain class. De-
tails of the Chapman-Enskog approach are to be found in [16], [17],
[81], and [20]. In the present section I will, following [81], section 3.2,
and [101], chapter 19, present in bare outline and without derivation,
a simplified version of the theory where only the first order correction 
to the leading term (i.e., the zero order term describing an ideal fluid, 
as mentioned above) is included and the values of transport coeffi- 


cients resulting from it are stated. 


We consider only the first two terms in the expansion of the one-particle
distribution function f as

$$f = f^{[0]} + f^{[1]}, \tag{8-82}$$

where $f^{[0]}$ corresponds to the zero order approximation describing an ideal
fluid and $f^{[1]}$ stands for the first order correction to $f^{[0]}$, to be worked
out so as to obtain the corrected one-particle distribution function f.
We express f in the form $f^{[0]}(1+\phi)$, where $\phi(\mathbf{r},\mathbf{p};t)$, of the order of the
Knudsen number, represents the deviation from the zero order distribu-
tion function. Substituting in the Boltzmann equation and noting that
$\frac{\partial f}{\partial t} - \{H, f\}$ (H = one-particle Hamiltonian) is expected to be small, one
finds that the deviation $\phi$ has to satisfy, in the leading order of approxi-
mation, the integral equation


do \p — afi! 


P| coi lie gi a ht OF [0] 
cee Tea as (¢+¢-o-¢)= a {H, f™}, 


(8-83) 


—n?I[¢| = [e®v'a0 


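(check this out; $\phi'$, $\tilde\phi$, $\tilde\phi'$ are defined in terms of $f'$, $\tilde f$, $\tilde f'$ in a manner anal-
ogous to the way $\phi$ is defined from f). The existence of a normal solution
to this integral equation is guaranteed by the existence of a solution to
the ideal fluid equations determined by $f^{[0]}$, the normal solution that we
started with, for which one has

$$\frac{\partial f^{[0]}}{\partial t} = \frac{\partial f^{[0]}}{\partial n}\frac{\partial n}{\partial t} + \frac{\partial f^{[0]}}{\partial\mathbf{u}}\cdot\frac{\partial\mathbf{u}}{\partial t} + \frac{\partial f^{[0]}}{\partial T}\frac{\partial T}{\partial t}. \tag{8-84}$$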


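By making use of the ideal fluid equations, the integral equation for $\phi$ can
be written as

$$-n^2I[\phi] = -f^{[0]}\left[\left(W^2 - \frac{5}{2}\right)(\mathbf{v}-\mathbf{u})\cdot\frac{\partial\ln T}{\partial\mathbf{r}} + 2\,b_{ij}\frac{\partial u_i}{\partial x_j}\right], \tag{8-85a}$$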


where velocity variables are used instead of momenta for the sake of con-
venience of reference and where the following notation has been made
use of:

$$\mathbf{W} = \sqrt{\frac{m}{2k_BT}}\,(\mathbf{v}-\mathbf{u}), \qquad b_{ij} = W_iW_j - \frac{1}{3}\,W^2\delta_{ij} \quad (i,j = 1,2,3). \tag{8-85b}$$


The solution to this linear inhomogeneous integral equation is sought in
the form


or, pst) = — 24/2 aw. 2? lay) 5,24, 8-86) 
n vj 


where A(W), B(W) are scalar functions that can be seen to satisfy two 
independent integral equations when one substitutes (8-86) in (8-85a). 
These can be solved by expanding A(W), B(W) in two series involving sets 
of orthonormal polynomials (the Associated Laguerre polynomials, also 
referred to as Sonine polynomials) and retaining an appropriate num- 
ber of terms (commonly, only two) in the expansion. The solutions for 
A(W), B(W) can be expressed in terms of non-trivial collision integrals 
involving detailed features of the collision dynamics for a pair of colliding 
molecules. In other words, in the Chapman-Enskog theory, the correc- 
tion to the zero order distribution function can be explicitly related to the 


interaction potential. 


Once the corrected one-particle distribution function is in place, one can
work out the various local currents and fluxes, and then the transport co-
efficients. One obtains, for the thermal conductivity and shear viscosity,
the expressions

$$\kappa = \frac{75k_B}{64m}\sqrt{\frac{mk_BT}{\pi}}\,\frac{1}{\Omega}, \qquad \eta = \frac{5}{16}\sqrt{\frac{mk_BT}{\pi}}\,\frac{1}{\Omega}, \tag{8-87}$$

which constitute leading approximations in the Chapman-Enskog the-
ory. In these expressions, $\Omega$ stands for a collision integral depending
on the inter-molecular potential, and also on the temperature (see [81]
for further details). In the case of a hard sphere interaction, one ob-
tains $\Omega = \sigma^2$, which gives results analogous to those in elementary kinetic
theory (formulae (8-72a), (8-72b)) as also in the relaxation time approx-
imation (eq. (8-79a)) though, as mentioned earlier, the pre-factors differ,
implying the correct value ($\frac{15k_B}{4m}$) of the ratio $\kappa/\eta$.


Additionally, the temperature-dependence of the transport coefficients
gets modified (as compared to the hard sphere values) due to the factor
$1/\Omega$. While the hard sphere potential (infinitely repulsive for inter-molecular
separations less than or equal to $\sigma$) gives a $\sqrt{T}$ dependence, a linear de-
pendence is obtained for an inverse fourth power repulsive potential (i.e.,
inverse fifth power force law) investigated by Maxwell, this being com-
monly referred to as the Maxwell potential. Observed data are not in
full agreement with either of these models, and point to inter-molecular
interactions of a more complex nature.
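
For the hard sphere model the three approaches discussed so far share the common factor $\sqrt{mk_BT}/\sigma^2$ in the viscosity and differ only in numerical pre-factors. The following sketch tabulates them, taking the elementary, relaxation time, and leading order Chapman-Enskog pre-factors as $2/(3\pi^{3/2})$, $1/(4\pi^{1/2})$ and $5/(16\pi^{1/2})$ respectively. The function names, scheme labels, and input values are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

# Pre-factors multiplying sqrt(m k_B T) / sigma^2 in the hard sphere viscosity.
ETA_PREFACTOR = {
    "elementary": 2.0 / (3.0 * math.pi**1.5),
    "relaxation": 1.0 / (4.0 * math.sqrt(math.pi)),
    "chapman_enskog": 5.0 / (16.0 * math.sqrt(math.pi)),
}

# Ratio kappa/eta in units of k_B/m for each scheme.
KAPPA_OVER_ETA = {"elementary": 1.5, "relaxation": 2.5, "chapman_enskog": 3.75}


def eta_hard_sphere(T, m, sigma, scheme):
    """Shear viscosity of a hard sphere gas in the named approximation."""
    return ETA_PREFACTOR[scheme] * math.sqrt(m * K_B * T) / sigma**2


def kappa_hard_sphere(T, m, sigma, scheme):
    """Thermal conductivity obtained from eta and the scheme's kappa/eta ratio."""
    return KAPPA_OVER_ETA[scheme] * (K_B / m) * eta_hard_sphere(T, m, sigma, scheme)


# Illustrative, assumed inputs (roughly argon-like; not taken from the text).
m, sigma, T = 6.63e-26, 3.6e-10, 300.0

for scheme in ETA_PREFACTOR:
    print(scheme, eta_hard_sphere(T, m, sigma, scheme),
          kappa_hard_sphere(T, m, sigma, scheme))
```

All three schemes give the same $\sqrt{T}$ scaling and the same independence of density; only the Chapman-Enskog pre-factors are quantitatively reliable for hard spheres.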


The Chapman-Enskog theory also leads us, in the case of a two-component
gas, to the expression for the diffusion coefficient of either of the two com-
ponents in the background of the other. If, in the resulting expressions,
one identifies the parameters of the two components with each other (such
as the molecular masses and the collision diameters; the number density is
to be taken as twice the number density of each component in the mixture),
then one obtains the self-diffusion coefficient D introduced in sec. 8.3.7.2,
but with a correction factor depending on the intermolecular potential:


_ 3 fig 1 
~~ 16n V rm OQ” 


(8-88) 
where $\Omega'$, analogous to $\Omega$ in (8-87), can be expressed in terms of a collision
integral involving the intermolecular interaction, defined in such a way
that, in the case of the hard sphere model, its value reduces to $\sigma^2$. For
further details regarding $\Omega$, $\Omega'$ refer to [81], chapter 3, chapter 8.


Having learnt about the corrected one-particle distribution function in 
the Chapman-Enskog theory and the transport coefficients in the leading 
approximation in this theory, one can work out the balance equations 
with the corrected distribution function, when one finds that the mass 
balance equation remains unchanged in form, while the momentum bal- 
ance and energy balance equations get modified by the inclusion of the 
dissipative terms arising due to the non-zero values of the transport co- 
efficients. Skipping the intermediate steps of algebraic manipulations, 


these are seen to be of the following forms: 


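[Chapman-Enskog: momentum balance] The Navier-Stokes equation

$$\rho\left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)\mathbf{u} = \frac{\rho}{m}\mathbf{F} - \nabla p + \eta\nabla^2\mathbf{u} + \frac{\eta}{3}\,\nabla(\nabla\cdot\mathbf{u}), \tag{8-89a}$$

[Chapman-Enskog: energy balance]

$$\rho\left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)T + \frac{2m}{3k_B}\,p\,\nabla\cdot\mathbf{u} - \frac{2m}{3k_B}\,\kappa\nabla^2T = 0. \tag{8-89b}$$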


The three balance equations mentioned above are also known as the
constitutive equations for a one-component material system. One
further relation, known as Fick’s law, relates mass flow to concentra-
tion gradient, and leads to a fourth constitutive equation,

$$\frac{\partial\rho}{\partial t} = D\nabla^2\rho, \tag{8-90}$$

where D stands for the diffusion coefficient. However, as mentioned
earlier, the microscopic theory of diffusion is more meaningfully for- 
mulated in the setting of a multi-component system (or else in terms 
of the motion of a tagged particle in a single-component system) 


where, strictly speaking, diffusion is caused by a gradient in the 


chemical potential of a component. 
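
The way a diffusion equation of the form $\partial\rho/\partial t = D\nabla^2\rho$ evens out inhomogeneities can be seen in a few lines of code. The following is a minimal one-dimensional sketch using an explicit finite-difference update with periodic ends; the scheme, the function name, and all numerical values are illustrative assumptions and not part of the theory above.

```python
def diffuse(rho, D, dx, dt, steps):
    """Explicit (forward-time, centred-space) update of d(rho)/dt = D d2(rho)/dx2.

    Periodic boundary conditions are used, so the total mass is conserved.
    """
    r = D * dt / dx**2
    if r > 0.5:
        raise ValueError("unstable step: need D * dt / dx**2 <= 1/2")
    rho = list(rho)
    n = len(rho)
    for _ in range(steps):
        # rho[i - 1] wraps around for i = 0, giving the periodic left neighbour.
        rho = [rho[i] + r * (rho[(i + 1) % n] - 2.0 * rho[i] + rho[i - 1])
               for i in range(n)]
    return rho


# A single density spike on a ring of 50 cells (illustrative initial state).
initial = [0.0] * 50
initial[25] = 1.0

final = diffuse(initial, D=1.0e-5, dx=1.0e-3, dt=0.02, steps=200)
print(sum(final), max(final))
```

The spike spreads out and its peak value drops, while the sum over all cells stays fixed, as required by mass conservation.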


Recall that the balance equations stated here are derived for a Boltz-
mann gas, and need modification for dense fluids. For instance, the
Navier-Stokes equation (the equation of motion for a viscous fluid) can be
derived under general considerations (without reference to kinetic theory)
relating to conservation of momentum, and has to be modified by adding
a term $\zeta\,\nabla(\nabla\cdot\mathbf{u})$ to the right hand side of (8-89a), where $\zeta$ stands for the
bulk viscosity of the fluid under consideration (refer back to sec. 8.2.3.2).
Kinetic theory or statistical mechanics is called into play when one needs 
to relate the transport coefficients to more fundamental considerations 
involving inter-molecular interactions. For instance, as we have seen 
in the present section, the Chapman-Enskog theory can be resorted to 
for deriving the balance equations, including the Navier-Stokes equation, 
and also for deriving expressions for the transport coefficients from mi- 


croscopic considerations. 


Of course, the balance equations (8-89a), (8-89b) are obtained only in the
leading approximation in the Chapman-Enskog theory, and one can go
further with the series expansion for the one-particle distribution func- 
tion. This, however, is a daunting task since the calculations become 
enormously complex as one goes to higher orders. While we have consid- 
ered the first order correction to the zero order ideal-fluid theory, the sec- 
ond order correction has also been worked out. In this order, the Navier- 
Stokes equation gets modified to what is referred to as the Burnett equa- 
tion — one of much greater complexity, containing higher derivatives of 
the thermodynamic variables and higher powers of the lower derivatives. 
In other words, the Navier-Stokes equation (augmented by the bulk vis- 
cosity term) is not the most general form of the momentum balance equa- 
tion for a fluid (nor is (8-89b) the most general form of the energy balance 
equation). As one goes to denser systems, the balance equations get 
transformed dramatically owing to mutual effects in transport, as men- 
tioned in sec. 8.2.2. On the other hand such mutual effects are also 
seen to appear in the Chapman-Enskog theory for a multi-component 
Boltzmann gas where, interestingly, there arises a mutual effect between 
thermal and material flows, leading to the phenomena of thermal diffu-
sion and diffusional heat flow.

8.4 Non-equilibrium statistical mechanics: the linear response theory


8.4.1 Linear response: introduction 


Analogous to the thermodynamics of irreversible processes, non-equilibrium 
statistical mechanics in the linear regime assumes that the system un- 
der consideration is close to an equilibrium state where the term ‘close to’ 
implies the validity of the assumption of a linear relationship between the 
force responsible for the deviation from equilibrium and the response of 
the system to that force. This will be made more precise in the following 


paragraphs. 


The Boltzmann equation, which describes the non-equilibrium evolution 
of a dilute gas, is limited in scope since the results derived from it cannot 
be fruitfully extended to dense systems, though it generates a great deal 
of insight into non-equilibrium processes in general. An approach com- 
plementary to that of the Boltzmann equation and vastly general in scope 
is based on the relation between response and correlations in evolving sys- 
tems in the linear regime, regardless of their constitution. The basic idea 
was mooted by Onsager in the form of the celebrated regression hypoth- 
esis that subsequently culminated in the fluctuation-dissipation theorem. 
This approach, commonly referred to as the linear response theory, ap-
plies to condensed systems and constitutes the most well-studied theory
of a general nature in non-equilibrium statistical mechanics, analogous 
to the use of the standard ensembles in equilibrium statistical mechan- 


ics. Indeed, linear response theory makes essential use of the equilibrium 




ensembles, among which we will focus on the canonical ensemble in the 


present chapter. 


The Boltzmann equation in itself is not confined to a description of the evolu- 
tion of the system in the linear regime. However, the results stated above in 
respect of the transport coefficients, obtained in the relaxation time approx- 
imation and the first order Chapman-Enskog approximation, are all based 
on a linear relationship between the thermodynamic fluxes and forces. In-
deed, the transport coefficients are defined within the limits set by such a


linear relationship. 


In this chapter, we outline the fundamentals of linear response theory 
with reference to classical systems and will also formulate the corre- 
sponding principles in the more general quantum context. All the ba- 
sic results of the theory involve averages of observables evaluated over 
equilibrium ensembles, and their Fourier or Laplace transforms, and of- 
ten these appear to be formally similar in both classical and quantum 
considerations. One has to remember that, while some particular result 
is stated within the classical framework, a corresponding result holds in 
the quantum theoretic framework as well. At times, this correspondence 


will be pointed out explicitly. 


Notation. As in previous sections, we consider the phase space
of an N-particle system (N may be as small as 1, as in the
case of a single simple harmonic oscillator, which is often re-
ferred to). We use the notation $r^{[N]} = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$ for the
collection of position co-ordinates of the particles while
$p^{[N]} = \{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N\}$ has a similar significance in respect of the mo-
mentum co-ordinates. These will at times be further abbrevi-
ated to $R = r^{[N]}$ and $P = p^{[N]}$ respectively (cf. section 1.3.2.1
where the abbreviations Q, P were introduced). The symbol $z^{[N]}$
will stand for the collection of 2N vector co-ordinates
$z^{[N]} = (r^{[N]}, p^{[N]})$. A volume element in the phase space will
be denoted by $d^{[N]}z^{[N]} = d^3r_1\cdots d^3r_N\,d^3p_1\cdots d^3p_N$ while, writ-
ten outside an integral sign, this will be denoted as
$\delta^{[N]}z^{[N]} = \delta^3r_1\cdots\delta^3p_N$. The super-index [N] will, at times, be omitted
when there is no scope for confusion; thus $d^{[N]}z = d^{[N]}z^{[N]}$.
Since, generally speaking, we will consider systems of identi-
cal particles, a factor $\frac{1}{N!}$ will have to be used while evaluat-
ing a phase space integral (refer back to sec. 1.3.2.2). Discrete
phase space co-ordinates, if any, such as spin components will
be mentioned separately as necessary. A hat over a symbol will
generally be used to denote an operator representing a quan-
tum mechanical observable.


After outlining how the time evolution of a system is described in the clas- 
sical and quantum formulations (sec. 8.4.2 below), we begin the analysis 
of non-equilibrium systems in the linear regime by explaining the re- 
gression hypothesis in sec. 8.4.3 and then go on to state and explain 
the fluctuation-dissipation theorem (sections 8.4.4, 8.4.6). As mentioned
above, the non-equilibrium time evolution in the linear regime deals with
the response of a system to a small perturbation imposed over an equi-
librium configuration obtaining under specified constraints, by which we
mostly mean specified values of the parameters N, V, T, corresponding to
the canonical ensemble, while grand canonical ensembles are also of rel-
evance. We introduce the idea of the response function and its Fourier
transform, the dynamic susceptibility, and state a number of character- 
istic features of these functions. The fluctuation-dissipation theorem re- 
lates the response functions to time correlation functions, whose Fourier 
transforms are referred to as dynamic structure factors. The experimen- 
tal implications of the relation between dynamic susceptibilities and dy- 
namic structure factors will be pointed out. Finally, we indicate how one 
can relate the transport coefficients of a system to current correlation func- 
tions where the latter, like all other time correlation functions featuring 


in the linear response theory, are averages over equilibrium ensembles. 


8.4.2 Time evolution: the Schrödinger and the Heisenberg type descriptions

The time evolution of states and observables in the quantum context can
be described either in the Schrödinger or in the Heisenberg pictures,
while the interaction picture is also made use of when dealing with
interacting systems (see below). It is worthwhile to note that the Schrödinger
and the Heisenberg type descriptions are relevant in the classical context
as well.

8.4.2.1 Time evolution: classical


In the classical Schrödinger type description the state of the system under
consideration is represented by a probability distribution
$\rho = \rho(z^{[N]})$ in the phase space that evolves in time in accordance
with the Liouville equation

$$\frac{\partial \rho}{\partial t} = -\{\rho, H\} = -i\mathcal{L}\rho, \tag{8-91a}$$

where the Poisson bracket $\{A, B\}$ between two phase space functions $A, B$
is given by

$$\{A, B\} = \frac{\partial A}{\partial r^{[N]}} \cdot \frac{\partial B}{\partial p^{[N]}} - \frac{\partial B}{\partial r^{[N]}} \cdot \frac{\partial A}{\partial p^{[N]}}, \tag{8-91b}$$

and the action of the operator $\mathcal{L} = iL$ on a phase space function
$A(r^{[N]}, p^{[N]})$ is defined as

$$\mathcal{L}A(r^{[N]}, p^{[N]}) = -i\{A, H\}. \tag{8-91c}$$

The operator $L$, whose action on a phase space function gives
$LA = \{H, A\}$, is referred to as the Liouville operator. For an outline of
the derivation of the Liouville equation, refer back to sec. 8.3.3; see also
[140], chapters 2, 13.
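
As a quick numerical illustration of (8-91a) to (8-91c), the following sketch evaluates Poisson brackets by central finite differences for a single one-dimensional harmonic oscillator. The choice of system, the parameter values and the helper name `poisson` are assumptions made here for illustration and do not come from the text. The sketch checks that $\{x, H\}$ and $\{p, H\}$ reproduce Hamilton's equations, and that a distribution depending on the phase space point only through $H$ is stationary under the Liouville equation.

```python
import math

M, OMEGA, BETA = 1.0, 2.0, 0.5   # illustrative mass, frequency, inverse temperature

def poisson(A, B, x, p, eps=1e-5):
    """Poisson bracket {A, B} = dA/dx * dB/dp - dB/dx * dA/dp for one degree
    of freedom, estimated with central finite differences."""
    dA_dx = (A(x + eps, p) - A(x - eps, p)) / (2 * eps)
    dA_dp = (A(x, p + eps) - A(x, p - eps)) / (2 * eps)
    dB_dx = (B(x + eps, p) - B(x - eps, p)) / (2 * eps)
    dB_dp = (B(x, p + eps) - B(x, p - eps)) / (2 * eps)
    return dA_dx * dB_dp - dB_dx * dA_dp

def hamiltonian(x, p):
    return p * p / (2 * M) + 0.5 * M * OMEGA ** 2 * x * x

def rho_eq(x, p):
    # Unnormalized canonical weight; the normalization does not affect the bracket test.
    return math.exp(-BETA * hamiltonian(x, p))

def position(x, p):
    return x

def momentum(x, p):
    return p

x0, p0 = 0.3, -0.7                                 # arbitrary phase space point
x_dot = poisson(position, hamiltonian, x0, p0)     # dx/dt = {x, H} = p/M
p_dot = poisson(momentum, hamiltonian, x0, p0)     # dp/dt = {p, H} = -M*OMEGA^2*x
drho_dt = -poisson(rho_eq, hamiltonian, x0, p0)    # right hand side of (8-91a)
```

Here `x_dot` and `p_dot` should agree with `p0 / M` and `-M * OMEGA**2 * x0` to within the finite-difference error, while `drho_dt` should vanish to the same accuracy.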


In the above formulae $H$ stands for the Hamiltonian of the system. The
reason why the factor $i$ is included in (8-91a), (8-91c) is that the operator
$\mathcal{L}$, defined this way, turns out to be a self-adjoint one [140] in
the Hilbert space of (appropriately defined) functions, including observables
and probability distributions, on the phase space, where the inner product of
two such functions is defined as

$$(\phi, \psi) = \int d^{[N]}z\, \phi^*(z^{[N]})\, \psi(z^{[N]}). \tag{8-92}$$

The boundary conditions to be satisfied by the functions under consideration
are that these are to go to zero as the magnitude of any of the position
vectors $(\mathbf{r}_1, \mathbf{r}_2, \cdots)$ or the momentum vectors
$(\mathbf{p}_1, \mathbf{p}_2, \cdots)$ goes to infinity.

It then follows that the distribution function $\rho$ evolves in time as

$$\rho(z^{[N]}; t) = e^{-i\mathcal{L}t}\rho(z^{[N]}; 0), \tag{8-93a}$$

or, in brief,

$$\rho(t) = e^{-i\mathcal{L}t}\rho(0), \tag{8-93b}$$

where the initial time has been set to zero but can be chosen arbitrarily
on the assumption that the Hamiltonian does not depend explicitly on
time. The operator $e^{-i\mathcal{L}t}$ in the Hilbert space of functions on
the phase space is a unitary one, as a result of which one has the equality

$$\int d^{[N]}z\, \left(e^{-i\mathcal{L}t}\phi\right)^* \left(e^{-i\mathcal{L}t}\psi\right) = \int d^{[N]}z\, \phi^*\psi, \tag{8-94}$$

for any two appropriately defined functions $\phi, \psi$ on the phase space.

The more general case of non-Hamiltonian evolution will be briefly outlined
in sec. 8.4.5. Non-Hamiltonian time-evolution occurs in the case of
dissipative systems, modeled as thermostated ones (see sec. 9.9) where,
however, the restriction of linearity of response is not an essential one.


In this Schrödinger type description of time evolution in the classical
context, an observable, defined in terms of a real-valued function
$A(z^{[N]})$ on the phase space, does not evolve with time (assuming that the
function $A$ does not depend explicitly on $t$), and the average taken over
the distribution function $\rho$ at time $t$ is given by

$$\langle A\rangle(t) = \int d^{[N]}z\, A(z^{[N]})\,\rho(z^{[N]}, t), \tag{8-95}$$

where $\rho(z^{[N]}, t)$ is obtained from $\rho(z^{[N]}, t = 0)$ by the action
of the evolution operator $e^{-i\mathcal{L}t}$.

On the other hand, a Heisenberg type description is also possible in the
classical context where the probability distribution $\rho$ is assumed to
remain unchanged in time while the observable $A$ evolves as

$$A(z^{[N]}; t) = A(z^{[N]}(t); 0), \tag{8-96}$$

where a phase space point $z^{[N]}$ (at time $t = 0$) is assumed to get shifted
to $z^{[N]}(t)$ by the classical dynamical equations. Equivalently, $A(z;t)$ is
obtained from $A(z;0)$ (we abbreviate $z^{[N]}$ to $z$) by the action of the
operator $e^{i\mathcal{L}t}$:

$$A(z;t) = e^{i\mathcal{L}t}A(z;0), \tag{8-97}$$

(reason this out).

In this Heisenberg mode of description one has

$$\langle A\rangle(t) = \int dz\, A(z^{[N]}; t)\,\rho(z^{[N]}). \tag{8-98}$$

The two expressions for $\langle A\rangle(t)$ obtained above turn out to be the
same, as can be seen by making use of the unitarity of the operator
$e^{-i\mathcal{L}t}$. More directly, this can be seen by a change of the
integration variables in (8-98) and the use of Liouville's theorem mentioned
in sec. 8.3.3 (check this out).
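
The equality of the two averages can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not part of the text: a one-dimensional harmonic oscillator with unit mass and frequency (so that the flow is a rotation in the phase plane), a Gaussian initial distribution centred at $(x, p) = (1, 0)$, the observable $A = x^2$, and a plain midpoint rule on a uniform grid. It evaluates (8-95) with $\rho(z;t) = \rho(z(-t);0)$ and (8-98) with $A(z;t) = A(z(t);0)$.

```python
import math

SIGMA = 0.3                       # width of the illustrative initial distribution
T = 0.7                           # time at which the two averages are compared
STEP, HALF_WIDTH = 0.05, 4.0      # grid spacing and half-width of the square grid
N_GRID = int(round(2 * HALF_WIDTH / STEP))

def flow(x, p, t):
    """Hamiltonian flow z -> z(t) for H = p^2/2 + x^2/2 (unit mass, unit frequency)."""
    c, s = math.cos(t), math.sin(t)
    return x * c + p * s, p * c - x * s

def rho0(x, p):
    """Initial distribution rho(z; 0): a normalized Gaussian centred at (1, 0)."""
    norm = 2 * math.pi * SIGMA ** 2
    return math.exp(-((x - 1.0) ** 2 + p ** 2) / (2 * SIGMA ** 2)) / norm

def observable(x, p):
    return x * x

def phase_space_integral(f):
    """Midpoint-rule estimate of the integral of f(x, p) over the grid."""
    total = 0.0
    for i in range(N_GRID):
        x = -HALF_WIDTH + (i + 0.5) * STEP
        for j in range(N_GRID):
            p = -HALF_WIDTH + (j + 0.5) * STEP
            total += f(x, p)
    return total * STEP * STEP

# Schrodinger type: A is fixed and rho(z; t) = rho(z(-t); 0), as in (8-95) and (8-99).
schrodinger_avg = phase_space_integral(
    lambda x, p: observable(x, p) * rho0(*flow(x, p, -T)))
# Heisenberg type: rho is fixed and A(z; t) = A(z(t); 0), as in (8-96) and (8-98).
heisenberg_avg = phase_space_integral(
    lambda x, p: observable(*flow(x, p, T)) * rho0(x, p))
# Closed form for this particular Gaussian example.
exact = math.cos(T) ** 2 + SIGMA ** 2
```

The two grid estimates agree with each other and with the closed-form value for this example; the agreement rests on the flow preserving the phase space volume, which is the content of Liouville's theorem.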


In summary, the time translation of a probability distribution $\rho(z)$
and that of a phase space function $A(z)$ representing an observable
are realized, in the Schrödinger and the Heisenberg type descriptions
respectively, by means of the operators $e^{-i\mathcal{L}t}$ and
$e^{i\mathcal{L}t}$:

$$e^{-i\mathcal{L}t}\rho(z;0) = \rho(z;t) = \rho(z(-t);0); \qquad e^{i\mathcal{L}t}A(z;0) = A(z;t) = A(z(t);0). \tag{8-99}$$


8.4.2.2 Time evolution: quantum mechanical 


While states (in general, mixed ones) and observables appear as functions 
on the phase space in the classical description, these appear as operators 


in the quantum context. 


In either of the classical and quantum descriptions, the set of observables
has the structure of an algebra, and the states constitute mappings from
the algebra of observables to the set of real numbers [121].

In the Schrödinger picture, a state described by the density operator
(commonly referred to as a density matrix) $\hat{\rho}(t)$ evolves in time as

$$\hat{\rho}(t) = e^{-\frac{i}{\hbar}\hat{H}(t-t_0)}\,\hat{\rho}(t_0)\,e^{\frac{i}{\hbar}\hat{H}(t-t_0)}. \tag{8-100a}$$

In this context, the unitary evolution operator $\hat{U}(t,t_0)$ defined as

$$\hat{U}(t, t_0) = e^{-\frac{i}{\hbar}\hat{H}(t-t_0)} \tag{8-100b}$$

specifies the evolution of the density operator from time $t_0$ to $t$ (it
depends on $t, t_0$ only through $t - t_0$ if the Hamiltonian $\hat{H}$ does not
depend explicitly on time). On the other hand, the operator $\hat{A}$
representing an observable does not evolve with time (unless it involves an
explicit time dependence).

In other words, the operator (say, $\hat{A}$) representing an observable that
does not depend explicitly on time remains unchanged under time evolution in
the Schrödinger picture. The average value of the observable at time $t$ is
then given by

$$\langle A\rangle(t) = \mathrm{Tr}\left(\hat{U}(t,t_0)\,\hat{\rho}(t_0)\,\hat{U}^\dagger(t,t_0)\,\hat{A}\right). \tag{8-101}$$
The unitary evolution operator $\hat{U}(t,t_0)$ satisfies the operator
differential equation

$$i\hbar\frac{\partial \hat{U}(t,t_0)}{\partial t} = \hat{H}\hat{U}(t,t_0), \tag{8-102}$$

with the boundary condition $\hat{U}(t_0,t_0) = \hat{I}$, the unit operator,
while the differential form of the evolution equation (8-100a), referred to
as the Neumann (or, at times, as the Neumann-Liouville) equation, reads

$$i\hbar\frac{d\hat{\rho}}{dt} = [\hat{H}, \hat{\rho}]. \tag{8-103}$$

In the Heisenberg picture, on the other hand, $\hat{\rho}$ does not evolve in
time while $\hat{A}$ evolves as

$$\hat{A}(t) = \hat{U}^\dagger(t,t_0)\,\hat{A}(t_0)\,\hat{U}(t,t_0), \tag{8-104}$$

and one obtains

$$\langle A\rangle(t) = \mathrm{Tr}\left(\hat{\rho}\,\hat{U}^\dagger(t,t_0)\,\hat{A}(t_0)\,\hat{U}(t,t_0)\right), \tag{8-105}$$

which agrees with the expression in (8-101) because of the cyclic property
of the trace (check this out).
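
The agreement between (8-101) and (8-105) can be made concrete with a two-level system. The sketch below is only an illustration: the Hamiltonian $\hat{H} = (\omega/2)\sigma_z$ (with $\hbar = 1$), the initial pure state, the observable $\sigma_x$ and the small matrix helpers are all choices made here, not taken from the text. For this example both pictures should give $\langle\sigma_x\rangle(t) = \cos\omega t$.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

OMEGA, T = 1.3, 0.9   # level splitting (hbar = 1) and elapsed time t - t0, both illustrative

# U = exp(-i H T) for H = (OMEGA / 2) sigma_z, which is diagonal in this basis.
U = [[cmath.exp(-0.5j * OMEGA * T), 0j],
     [0j, cmath.exp(0.5j * OMEGA * T)]]
# Density operator of the pure state (|up> + |down>) / sqrt(2) at the initial time.
rho0 = [[0.5 + 0j, 0.5 + 0j],
        [0.5 + 0j, 0.5 + 0j]]
sigma_x = [[0j, 1 + 0j],
           [1 + 0j, 0j]]

# Schrodinger picture, eq. (8-101): evolve the state, keep the observable fixed.
schrodinger = trace(matmul(matmul(matmul(U, rho0), dagger(U)), sigma_x))
# Heisenberg picture, eq. (8-105): keep the state fixed, evolve the observable.
heisenberg = trace(matmul(rho0, matmul(matmul(dagger(U), sigma_x), U)))
```

Both traces are real up to rounding and equal to `math.cos(OMEGA * T)`; the equality of the two expressions uses nothing beyond the cyclic property of the trace.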


Quantum theory often makes use of a third, interaction, picture which
is, in a sense, intermediate between the Schrödinger and the Heisenberg
pictures, and turns out to be of great value when the Hamiltonian is of
the form

$$\hat{H} = \hat{H}_0 + \hat{V}, \tag{8-106}$$

where $\hat{H}_0$ commonly appears as the Hamiltonian of an assembly of free
particles or of a system in the absence of external influence (or else
represents a part of $\hat{H}$ for which the solution to the eigenvalue problem
is known; in any case $\hat{H}_0$ is often termed the 'free' or 'intrinsic'
part of $\hat{H}$ for the sake of easy reference) while $\hat{V}$ represents an
interaction between the particles or an interaction with an external system
whose presence poses a challenge to the solution of the eigenvalue problem
(and hence to the evolution problem) for $\hat{H}$.

Given a Hamiltonian $\hat{H}$, the evolution problem is solved in principle if
its eigenvalues and eigenvectors are all known.

One then writes the evolution operator $\hat{U}(t,t_0)$, defined by (8-102), in
the form

$$\hat{U}(t,t_0) = \hat{U}_0(t,t_0)\,\hat{W}(t,t_0), \tag{8-107}$$

where $\hat{U}_0\ (= e^{-\frac{i}{\hbar}\hat{H}_0(t-t_0)})$ is the evolution
operator corresponding to the Hamiltonian $\hat{H}_0$, while $\hat{W}$
satisfies the differential equation

$$i\hbar\frac{d\hat{W}(t,t_0)}{dt} = \hat{V}^{I}(t)\,\hat{W}(t,t_0), \tag{8-108}$$

(with the boundary condition $\hat{W}(t_0,t_0) = \hat{I}$, the unit operator)
where $\hat{V}^{I}(t) = \hat{U}_0(t,t_0)^\dagger\,\hat{V}\,\hat{U}_0(t,t_0)$
represents the interaction potential in the interaction picture, in which any
operator $\hat{A}$, defined in the Schrödinger picture, evolves as
$\hat{A}^{I}(t) = \hat{U}_0(t,t_0)^\dagger\,\hat{A}\,\hat{U}_0(t,t_0)$, while a
density operator evolves as
$\hat{\rho}^{I}(t) = \hat{W}(t,t_0)\,\hat{\rho}(0)\,\hat{W}(t,t_0)^\dagger$,
$\hat{\rho}(0)$ being the density operator at the initial time $t_0$ at which
all operators and density matrices coincide in all the three pictures. It is
convenient to recast eq. (8-108) in the form of an integral equation,

$$\hat{W}(t,t_0) = \hat{I} - \frac{i}{\hbar}\int_{t_0}^{t} dt'\,\hat{V}^{I}(t')\,\hat{W}(t',t_0), \tag{8-109}$$

($\hat{I}$ = unit operator) since, written in this form, $\hat{W}$ can be solved
for iteratively when $\hat{V}$ represents a sufficiently weak interaction.
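
To indicate what the iteration looks like, one may substitute the integral equation into its own right hand side, starting from the zeroth approximation $\hat{W} \approx \hat{I}$. The display below is a sketch of the first two iterations (the familiar perturbation series in powers of the interaction); it uses only the integral equation and the boundary condition stated above, and the truncation after the second-order term is shown purely for illustration.

```latex
\hat{W}(t,t_0) \approx \hat{I}
  - \frac{i}{\hbar}\int_{t_0}^{t} dt'\,\hat{V}^{I}(t')
  + \left(-\frac{i}{\hbar}\right)^{2}\int_{t_0}^{t} dt'\int_{t_0}^{t'} dt''\,
    \hat{V}^{I}(t')\,\hat{V}^{I}(t'') + \cdots
```

The operators under the double integral appear in time order ($t' \geq t''$), which matters because $\hat{V}^{I}$ at different times need not commute.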


The mean value of an observable represented by the operator $\hat{A}$ is then
obtained as

$$\langle A\rangle(t) = \mathrm{Tr}\left(\hat{\rho}^{I}(t)\,\hat{A}^{I}(t)\right), \tag{8-110}$$

which agrees with (8-101) and (8-105) in virtue of the definitions of
$\hat{\rho}^{I}(t)$, $\hat{A}^{I}(t)$, and of the cyclic property of the trace
operation (check this out).


8.4.3 The regression hypothesis 


Much of non-equilibrium statistical mechanics starts from the regression
hypothesis of Onsager, which states that (appropriately defined)
macroscopic variables characterizing the instantaneous state of a
non-equilibrium system close to an equilibrium state (under specified
constraints, such as that corresponding to a given temperature) approach
their equilibrium values in proportion to correlations of fluctuations
relating to (relevant) phase space functions where the correlations are to be
worked out with the functions evaluated at appropriately specified time
instants. At times, this is referred to as the regression theorem; below we
prove the assertion made above within the classical framework.

Consider a system with Hamiltonian $H(z) = H(R,P)$ where we abbreviate
$z^{[N]}$ to $z$ (see sec. 8.4.1, where the symbols $R, P$ were introduced as
abbreviations for $r^{[N]}, p^{[N]}$), prepared at time $t = 0$ in a
non-equilibrium state as follows: we imagine that a constant perturbation
$-A(z)h$ was imposed on the system in the distant past ($t \to -\infty$) and
that the perturbation was maintained up to $t = 0$ by which time the system
has settled down to the equilibrium state corresponding to the Hamiltonian

$$H'(z) = H(z) - A(z)h, \tag{8-111}$$

where $h$ is to be treated as a coupling constant which will be assumed in
the following to be a small parameter. The state at $t = 0$, which we denote
by $\rho(z;0)$, is the probability distribution in the phase space (we continue
with the practice of referring to states and probability distributions
interchangeably) given by

$$\rho(z;0) = \frac{1}{Z}e^{-\beta(H - Ah)}, \tag{8-112a}$$

where $A = A(z)$ is assumed to be an observable (i.e., a real-valued phase
space function) chosen appropriately, which we need not specify. One can
look upon the state $\rho(0) = \rho(z;0)$ as a 'constrained' equilibrium
because of the presence of the constant perturbation $-Ah$. In the following,
we adopt a Heisenberg-like description of the time evolution of the system
where the state $\rho(t)$ at time $t\ (> 0)$ will coincide with $\rho(0)$, and
will be denoted by $\rho$, to be identified with the right hand side of
(8-112a). Accordingly, $Z$ stands for the partition function

$$Z = \int dz\, e^{-\beta(H - Ah)}. \tag{8-112b}$$

In the Heisenberg type description, the observable $A$ undergoes a time
evolution, but we will only need to refer to it at $t = 0$ so that, explicitly,
$A = A(0)$, this being the representation of the observable in a Schrödinger
type description.


Consider now an observable $B$ whose time evolution we will be interested
in. In the Heisenberg type description $B(t) = B(z;t)$ will differ from
$B(0) = B(z;0)$, where the dependence of $B$ on $t$ is through $z(t)$, the
phase space point obtained from $z$ by the dynamical equation corresponding to
the Hamiltonian $H$ (recall that the perturbation is switched off at $t = 0$;
we are interested in the time evolution for $t > 0$ when the system is in the
process of relaxing to the equilibrium state pertaining to the intrinsic
Hamiltonian $H$).

Let $\bar{B}(t)$ denote the mean value of the observable $B$ at time $t$, given
by

$$\bar{B}(t) = \frac{1}{Z}\int dz\, B(t)\, e^{-\beta(H - Ah)}. \tag{8-113}$$

It is important to note what $B(t)$ stands for. Since the Hamiltonian at times
subsequent to $t = 0$ is $H$, i.e., the intrinsic Hamiltonian without the
perturbation, $B(t)$ stands for the observable resulting from the Heisenberg
type evolution under the intrinsic Hamiltonian $H$ from $B = B(0)$. On the
other hand, in the Heisenberg type description, $\rho$ remains fixed at its
value at $t = 0$, i.e., at the value given by (8-112a).


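Note that $\rho(0)$ and $Z$, as defined above, differ from the equilibrium
distribution and the equilibrium partition function corresponding to the
intrinsic Hamiltonian $H$, which we denote by $\rho_{\rm eq}$ and $Z_{\rm eq}$
respectively:

$$\rho_{\rm eq} = \frac{1}{Z_{\rm eq}}e^{-\beta H}, \qquad Z_{\rm eq} = \int dz\, e^{-\beta H}. \tag{8-114}$$

We now make use of (8-112b) in (8-113) and consider a power series
expansion in the small parameter $h$, ignoring terms of degree $\geq 2$, which
will define the linear regime in the present context. We then have the
following approximations in the linear regime

$$e^{-\beta(H - Ah)} \approx e^{-\beta H}(1 + \beta hA), \qquad Z^{-1} \approx Z_{\rm eq}^{-1}\left[1 - \frac{1}{Z_{\rm eq}}\int dz\, e^{-\beta H}\beta hA\right], \tag{8-115}$$

and, in consequence,

$$\bar{B}(t) \approx \frac{1}{Z_{\rm eq}}\left[\int dz\, e^{-\beta H}B(t) + \int dz\, e^{-\beta H}\beta hA\,B(t) - \frac{1}{Z_{\rm eq}}\int dz\, e^{-\beta H}B(t)\int dz\, e^{-\beta H}\beta hA\right], \tag{8-116a}$$

(check this out), i.e.,

$$\bar{B}(t) = \langle B(t)\rangle - \beta h\langle A\rangle\langle B(t)\rangle + \beta h\langle AB(t)\rangle. \tag{8-116b}$$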


In this expression, the angular brackets denote the average over the
equilibrium ensemble corresponding to the 'intrinsic' Hamiltonian $H$, i.e.,
over the ensemble, given by (8-114), that the system finally relaxes to (check
the above result out). The equilibrium ensemble, however, is invariant
under time translation, i.e., $\langle B(t)\rangle = \langle B\rangle$ where,
as in the case of the observable $A$, $B = B(0) = B(z;0)$. In other words, we
have the final result

$$\Delta\bar{B}(t) = \bar{B}(t) - \langle B\rangle = \beta h\langle A(0)B(t)\rangle - \beta h\langle A\rangle\langle B\rangle. \tag{8-117a}$$

In this equation $\Delta\bar{B}(t)$ stands for the deviation of the
non-equilibrium average (at time $t$) from the equilibrium average of $B$, and
thus tells us how the observable $B(t)$ relaxes (or, regresses) to its final,
equilibrium, value when the perturbation is switched off. The right hand side
of the above formula, on the other hand, can be expressed as
$\beta h\langle(A - \langle A\rangle)(B(t) - \langle B\rangle)\rangle$ (check
this out), and the above result can then be written in the compact form

$$\Delta\bar{B}(t) = \beta h\langle\delta A(0)\,\delta B(t)\rangle, \tag{8-117b}$$

where, for any phase space function $f(t) = f(z;t) = f(z(t))$ we define
$\delta f(t) = f(t) - \langle f\rangle$. The right hand side of the above
formula gives the correlation between fluctuations of $A$ and $B$ at times $0$
and $t\ (> 0)$ respectively, and thus the formula relates the relaxation of
$B(t)$ to the correlation between the fluctuations of $A$ and $B$ at a time
interval of $t$ (the time translation invariance of the equilibrium
configuration ensures that a correlation of the form
$\langle A(\tau)B(t)\rangle$ depends on $\tau, t$ only through the difference
$t - \tau$). This, then, is the precise content of the regression hypothesis
stated at the beginning of this section.
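
The regression formula can be checked by hand in a simple case. The following worked example is a sketch under assumptions made here for illustration: a single one-dimensional harmonic oscillator with mass $m$ and frequency $\omega$, with $A = B = x$. Because this system is linear, the two sides agree exactly rather than only to first order in $h$; being undamped, the oscillator does not actually relax, so the example tests the identity and not the approach to equilibrium.

```latex
% Constrained equilibrium: completing the square in H' = H - hx
H - hx = \frac{p^2}{2m} + \frac{m\omega^2}{2}\left(x - \frac{h}{m\omega^2}\right)^2
         - \frac{h^2}{2m\omega^2}
\;\Rightarrow\; \bar{x}(0) = \frac{h}{m\omega^2}, \quad \bar{p}(0) = 0 .

% Free evolution under H for t > 0
x(t) = x\cos\omega t + \frac{p}{m\omega}\sin\omega t
\;\Rightarrow\; \Delta\bar{x}(t) = \frac{h}{m\omega^2}\cos\omega t .

% Equilibrium correlation under H, with <x> = 0, <xp> = 0, <x^2> = 1/(\beta m\omega^2)
\beta h\,\langle\delta x(0)\,\delta x(t)\rangle
  = \beta h\,\langle x^2\rangle\cos\omega t
  = \frac{h}{m\omega^2}\cos\omega t .
```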


Recall that the role of the observable $A$ here is to define the
non-equilibrium configuration that the system starts from at $t = 0$. At times,
one considers the regression of $A(t)$ itself starting from the
non-equilibrium configuration prepared as the equilibrium configuration for
the perturbed Hamiltonian $H - Ah$, when one ends up with the formula (see
[19], chapter 8),

$$\Delta\bar{A}(t) = \beta h\langle\delta A(0)\,\delta A(t)\rangle, \tag{8-117}$$

where it is to be recalled that the formula has been arrived at on ignoring
terms of the second and higher degrees in $h$, this being the condition
defining the linear regime of non-equilibrium evolution. From the physical
point of view, this means that all fluctuations from equilibrium values
should be sufficiently small.


On the face of it, it might seem that the initial state
$\rho(z) = \rho(z;0)$ from which the system is made to relax cannot be
specified arbitrarily since it has been chosen to be of the form (8-112a)
(recall that, in the Heisenberg type description adopted in the present
section, this remains unaltered throughout the time evolution, i.e.,
$\rho(z) = \rho(z;t)$; this differs from the equilibrium distribution
$\rho_{\rm eq}$ pertaining to the Hamiltonian $H$). However, it can be seen
that given an arbitrarily chosen distribution function $\rho(z)$ (satisfying
the normalization condition $\int dz\,\rho(z) = 1$) one can choose an
observable $A(z)$ such that (8-112a) is satisfied, an approximate solution
for $h\,\delta A = h(A(z) - \langle A\rangle)$ (note that it is $h\,\delta A$
that is relevant in the statement of the regression theorem) being given by

$$h\,\delta A(z) = \beta^{-1}\left[Z_{\rm eq}\,\rho(z)\,e^{\beta H} - 1\right], \tag{8-118}$$

which satisfies the consistency condition $\langle\delta A\rangle = 0$ (check
this out; the condition $\langle\delta A\rangle = 0$ follows from the
definition of $\delta A$).
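
The consistency check can be spelled out in one line. The sketch below assumes only the normalization of $\rho$ and of $\rho_{\rm eq}$, together with the identity $Z_{\rm eq}\,\rho_{\rm eq}\,e^{\beta H} = 1$ that follows from the definition of the equilibrium distribution.

```latex
\langle h\,\delta A\rangle
  = \int dz\,\rho_{\rm eq}(z)\,\beta^{-1}\left[Z_{\rm eq}\,\rho(z)\,e^{\beta H} - 1\right]
  = \beta^{-1}\left[\int dz\,\rho(z) - \int dz\,\rho_{\rm eq}(z)\right]
  = \beta^{-1}\,[\,1 - 1\,] = 0 .
```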


We will take up the process of relaxation from a constrained equilibrium 
again in sec. 8.4.7.3, considered in the general context of non-equilibrium 
processes initiated from equilibrium states under small perturbations 
imposed upon the relevant unperturbed Hamiltonian. There, the relax- 
ation function will be defined in both classical and quantum terms, with 
reference to the response function defined for such more general non- 


equilibrium processes. 


8.4.4 The fluctuation-dissipation theorem: classical 


The fluctuation-dissipation theorem (FDT in brief) is, in a sense, a
generalization of the regression hypothesis. As we saw in 8.4.3, the
regression theorem tells us how, given an initial non-equilibrium
configuration, the system under consideration relaxes such that, as $t$ goes
to $\infty$, the deviation of the mean of any observable $B(t) = B(z;t)$ from
its equilibrium value $\langle B\rangle$ goes to zero.

As the system approaches equilibrium at $t \to \infty$, all correlations of the
form $\langle\delta A(0)\,\delta B(t)\rangle$ reduce to
$\langle\delta A\rangle\langle\delta B\rangle$ which, by definition, is zero.
The question as to why all correlations are lost in the large $t$ limit is the
central question of statistical mechanics, and will be briefly addressed in a
subsequent section (sec. 9.4), and will come up at several other places in
this book.

We now generalize the context by asking how the system responds to
a small but arbitrarily chosen time dependent perturbation. Thus, we
assume that the Hamiltonian $H = H(z)$ is perturbed to

$$H' = H - Ah(t), \tag{8-119}$$

where $A = A(z)$ is an arbitrarily chosen phase space function representing
an observable for the system, and $h(t)$ is a time dependent coupling such
that $|h(t)|$ remains small at all times, in consequence of which the square
and higher powers of $h(t)$ can be ignored, this being the condition that
defines the linear regime in the present context.


We adopt here the Schrödinger type description, which is complementary
to the Heisenberg type description adopted in 8.4.3, and assume that,
at some initial time $t = t_0$, the system is in a state described by the
probability distribution $\rho(z;t_0)$ in the phase space. Then, at some later
time $t$, the state of the system will be $\rho(t) = \rho(z;t)$, given by the
Liouville equation (refer to (8-91a))

$$\frac{\partial\rho}{\partial t} = -i\mathcal{L}'\rho, \tag{8-120a}$$

where $\mathcal{L}'$ corresponds to the perturbed Hamiltonian
$H' = H - Ah(t)$, and differs to a small extent from the operator
$\mathcal{L}$ pertaining to the Hamiltonian $H$:

$$\mathcal{L}' = \mathcal{L} + \delta\mathcal{L}, \tag{8-120b}$$

the action of $\delta\mathcal{L}$ on a phase space function $f = f(z)$ being
given by

$$\delta\mathcal{L}f = ih(t)\{f, A\}. \tag{8-120c}$$

Noting that $\delta\mathcal{L} = \mathcal{L}' - \mathcal{L}$ is of degree 1 in
$h(t)$, one can expand $\rho(t)$ in a series of the form

$$\rho = \rho_0 + \rho_1 + \rho_2 + \cdots, \tag{8-121}$$

where $\rho_n$ $(n = 1, 2, \cdots)$ is of degree $n$ in $h(t)$ and, in
determining the time evolution of $\rho$, we ignore terms of degree $\geq 2$ so
as to confine our considerations to the linear regime. We make the definition
of our problem precise by assuming that, at the initial time $t = t_0$, the
system is in equilibrium with reference to the unperturbed Hamiltonian $H$,
i.e., $\rho(t_0) = \rho_{\rm eq}$ corresponding to some given temperature
$T = (k_B\beta)^{-1}$. Since, in the absence of the perturbation
$\delta\mathcal{L}$, $\rho(t)$ would continue to be $\rho_{\rm eq}$, one can
assume $\rho_0(t)$, the leading term in the expansion of $\rho(t)$, to be
given by

$$\rho_0(t) = \rho_{\rm eq} \quad (t \geq t_0), \tag{8-122a}$$

and then determine $\rho_1(t)$ (the term linear in $h(t)$ in the expansion for
$\rho$, this being our object of interest in the linear regime) by inserting
the expansion (8-121) in (8-120a) and equating terms linear in $h(t)$. This
gives

$$\frac{\partial\rho_1}{\partial t} + i\mathcal{L}\rho_1 = -i\,\delta\mathcal{L}\rho_0 = h(t)\{\rho_{\rm eq}, A\}. \tag{8-122b}$$

This equation can now be integrated to give $\rho_1(t)$:

$$\rho_1(t) = \int_{t_0}^{t} d\tau\, h(\tau)\, e^{-i(t-\tau)\mathcal{L}}\{\rho_{\rm eq}, A\}, \tag{8-123}$$

where we make use of the initial condition $\rho_1(t_0) = 0$. In the following,
we refer to $H$, $-i\mathcal{L}$ as the 'intrinsic' Hamiltonian and the
'intrinsic' Liouville operator for the sake of easy reference, since these
determine the evolution in the absence of interaction with an external field
(represented here for the sake of simplicity by $h(t)$). With this
understanding, the action of the operator $e^{-i(t-\tau)\mathcal{L}}$ results
in a 'free' evolution (i.e., evolution free from external influence) through a
time interval $t - \tau$.
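
That the integral expression for $\rho_1(t)$ does solve the first-order equation can be verified by direct differentiation. The following is a short sketch of that check, assuming only that $\mathcal{L}$ does not depend on time so that the derivative of the exponential brings down a factor $-i\mathcal{L}$.

```latex
\frac{\partial\rho_1}{\partial t}
  = h(t)\{\rho_{\rm eq}, A\}
    + \int_{t_0}^{t} d\tau\, h(\tau)\,(-i\mathcal{L})\,
      e^{-i(t-\tau)\mathcal{L}}\{\rho_{\rm eq}, A\}
  = h(t)\{\rho_{\rm eq}, A\} - i\mathcal{L}\rho_1(t),
\qquad \rho_1(t_0) = 0 .
```

The first term comes from the upper limit of the integral (where the exponential reduces to the identity), and the second from differentiating the integrand.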


In the following, all observables in the Schrödinger type description
will be represented by time-independent phase space functions, i.e.,
observables with explicit time dependence will be left out of consideration.

We now consider an observable represented by the phase space function
$B = B(z)$ and work out its 'mean deviation' $\Delta\bar{B}(t)$ at time $t$,
defined by the first equality in (8-117a) where, in the Schrödinger type
description, $\bar{B}(t)$ denotes the mean value of the observable $B$ under
the non-equilibrium probability distribution $\rho(t)$. In the linear
approximation we have, from (8-123),

$$\rho(t) = \rho_{\rm eq} + \int_{t_0}^{t} d\tau\, h(\tau)\, e^{-i(t-\tau)\mathcal{L}}\{\rho_{\rm eq}, A\}, \tag{8-124}$$

which gives

$$\Delta\bar{B}(t) = \int\!\!\int_{t_0}^{t} dz\, d\tau\, h(\tau)\, B(z)\, e^{-i(t-\tau)\mathcal{L}}\{\rho_{\rm eq}, A\}, \tag{8-125}$$

(check this out). Now, $\rho_{\rm eq}(z)$ depends on the phase space
co-ordinates only through the intrinsic Hamiltonian $H$, and thus,

$$\{\rho_{\rm eq}, A\} = \frac{\partial\rho_{\rm eq}}{\partial H}\{H, A\} = \beta\rho_{\rm eq}\dot{A}(0), \tag{8-126}$$

where, in the final expression, $\dot{A}(0)$ has the following interpretation.
Imagine the Heisenberg type evolution, pertaining to the intrinsic Hamiltonian
$H$, of an operator that, at time $t = 0$, is represented by $A(z)$. Then,
$\dot{A}(0)$ stands for the time rate of change of that operator, in its
imagined Heisenberg type evolution, at $t = 0$.
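
The steps behind this identity can be laid out explicitly. The sketch below uses only the canonical form of $\rho_{\rm eq}$ and the Heisenberg type equation of motion $\dot{A} = \{A, H\}$ under the intrinsic Hamiltonian; the two minus signs cancel.

```latex
\rho_{\rm eq} = \frac{e^{-\beta H}}{Z_{\rm eq}}
  \;\Rightarrow\; \frac{\partial\rho_{\rm eq}}{\partial H} = -\beta\rho_{\rm eq},
\qquad
\dot{A} = \{A, H\} = -\{H, A\},
\qquad
\{\rho_{\rm eq}, A\} = \frac{\partial\rho_{\rm eq}}{\partial H}\{H, A\}
  = (-\beta\rho_{\rm eq})(-\dot{A}) = \beta\rho_{\rm eq}\dot{A} .
```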


One then obtains

$$\Delta\bar{B}(t) = \beta\int\!\!\int_{t_0}^{t} dz\, d\tau\, h(\tau)\, B(z)\, e^{-i(t-\tau)\mathcal{L}}\rho_{\rm eq}\dot{A}(0). \tag{8-127}$$

We now make use of the unitarity of the operator $e^{-i(t-\tau)\mathcal{L}}$
(refer to sec. 8.4.2) and obtain

$$\Delta\bar{B}(t) = \beta\int\!\!\int_{t_0}^{t} dz\, d\tau\, h(\tau)\, \rho_{\rm eq}\dot{A}(0)\, e^{i(t-\tau)\mathcal{L}}B(z), \tag{8-128}$$

where we have made use of the fact that the phase space functions
$\{\rho_{\rm eq}, A\}$ and $B(z)$ are both real. Finally, performing the phase
space integration, we obtain

$$\Delta\bar{B}(t) = \beta\int_{t_0}^{t} d\tau\, h(\tau)\left\langle\frac{d\,\delta A}{dt}(0)\,\delta B(t-\tau)\right\rangle, \tag{8-129a}$$

where $\delta A, \delta B$ are defined as $A - \langle A\rangle$,
$B - \langle B\rangle$ respectively and, as already mentioned, the angular
brackets $\langle\cdot\rangle$ denote the mean value with respect to the
equilibrium distribution $\rho_{\rm eq}$ pertaining to the intrinsic
Hamiltonian $H$. In the above formula, the expression
$e^{i(t-\tau)\mathcal{L}}B(z)$ has been written as $B(t - \tau)$, where the
latter represents the operator resulting from the Heisenberg type evolution,
in time $t - \tau$, of one that is represented by $B(z)$ at $t = 0$.


Though we have agreed to work out the time dependence of $\delta B(t)$ in a
Schrödinger type description, the expression on the right hand side of
(8-129a) is in the nature of an inner product, where one can switch over to a
Heisenberg type description of the time evolution.

The mean value of an observable, say $B$, denoted by a bar ($\bar{B}$ in the
case of a non-equilibrium configuration) or by angular brackets
($\langle B\rangle$ in the case of an equilibrium configuration) is, of course,
independent of whether one follows a Schrödinger type or a Heisenberg type
description; in the present section we have worked out the expression for the
mean deviation $\Delta\bar{B}(t)$ by adopting a Schrödinger type description,
though, in the course of this derivation, a number of terms have come up that
denote time dependent values of observables, namely ones that would come up if
the latter were to follow a Heisenberg type evolution.

In writing (8-129a) I have used, in the interest of generality,
$\delta A, \delta B$ instead of $A, B$ by invoking the identity
$\frac{d}{dt}\langle A(t)\rangle = 0$, which expresses the invariance of the
equilibrium distribution under time translation. In the following we will
often replace $\delta A, \delta B$ with $A, B$ by assuming that
$\langle A\rangle = 0$, $\langle B\rangle = 0$; one can always redefine an
observable so as to convert it to one with zero mean.


In the present context of the problem of non-equilibrium evolution in
the linear regime, we take the initial time $t_0$ to be in the remote past
($t_0 \to -\infty$). In other words the system is assumed to be in the
equilibrium state pertaining to the intrinsic Hamiltonian $H$ at time
$-\infty$, after which the perturbation $-Ah(t)$ is applied to $H$, causing the
deviation $\Delta\bar{B}(t)$ of the non-equilibrium mean value, at time $t$, of
the observable $B$ from its equilibrium expectation value.

With all this in mind, we now have the simpler formula

$$\Delta\bar{B}(t) = \beta\int_{-\infty}^{t} d\tau\, h(\tau)\langle\dot{A}(0)B(t-\tau)\rangle. \tag{8-129b}$$

This formula can be taken as a statement of the fluctuation-dissipation
theorem in the classical context, though there exist a number of equivalent
statements, some of which will be mentioned below.

8.4.4.1 The response function: classical

One can define a response function corresponding to a perturbation of the
form $H' = H - Ah(t)$ by writing the mean deviation of the observable $B$ at
time $t$ as

$$\Delta\bar{B}(t) = \int_{-\infty}^{t} d\tau\, R'_{BA}(t-\tau)\,h(\tau). \tag{8-130a}$$

By comparing with (8-129b), one obtains

$$R'_{BA}(t) = \beta\langle\dot{A}(0)B(t)\rangle = -\beta\langle A(0)\dot{B}(t)\rangle. \tag{8-130b}$$


1. The second equality in (8-130b) is obtained as follows. Consider the
first equality written at some time $t'$:

$$R'_{BA}(t') = \beta\langle\dot{A}(0)B(t')\rangle. \tag{8-131a}$$

Noting that the equilibrium expectation value $\langle A(t)B(t+t')\rangle$ is
independent of $t$ by invariance under time translation, we have

$$\langle\dot{A}(t)B(t+t')\rangle = -\langle A(t)\dot{B}(t+t')\rangle. \tag{8-131b}$$

Taking $t = 0$ in this formula, one obtains

$$\langle\dot{A}(0)B(t')\rangle = -\langle A(0)\dot{B}(t')\rangle. \tag{8-131c}$$

On using the variable $t$ in the place of $t'$ now, we arrive at the required
formula.

2. The response function (and hence the fluctuation-dissipation theorem)
can be expressed in one of a number of equivalent forms. For instance,
making use of the unitarity of the operator $e^{-i\mathcal{L}t}$, one can
write, from (8-125) (refer to [82]),

$$R'_{BA}(t) = \int dz\,\{\rho_{\rm eq}, A(0)\}\,B(t), \tag{8-132}$$

(check this out).


The fluctuation-dissipation theorem, as stated above in (8-129b), expresses
the relation between the mean deviation, or the response function defined in
eq. (8-130a), and the correlation between a corresponding pair of observables
($A$ and $B$) at unequal times, the latter being an instance of what are
generally referred to as 'time correlation' functions.

The important thing to note about (8-130a) is that only those values of
$\tau$ that satisfy $\tau < t$ are relevant, i.e., in (8-130b), one needs to
know $R'_{BA}(t)$ only for $t > 0$. This reflects the fact that the response to
a perturbation follows the dictates of causality. Regardless of whether one
defines the response function in the classical or the quantum framework (see
sec. 8.4.6 below), the mean deviation $\Delta\bar{B}(t)$ arises as the effect
of the perturbation $-Ah(t)$ imposed on the intrinsic Hamiltonian, where the
perturbation acts as the cause; the cause has to precede the effect. The
response function determines the effect at time $t$ as a superposition of the
cause acting at various time instants $\tau$ in the range
$-\infty < \tau < t$. Such a representation as a superposition is possible
since we have confined our considerations to the linear regime.
From a mathematical point of view, the restriction imposed on the re- 
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sponse function, defined in (8-130b), by the requirement of causality, to 
the effect that the time argument has to be positive, is removed by defin- 
ing what we will refer to as the causal response function Rga(t) defined 


as 


Rpa(t) = O(t) Rig (t), (8-133a) 


where O(t) stands for the Heaviside step function, 


O(t) =1 fort > 0 


=() tor t=. 0: (8-133b) 


The term ‘causal response function’ is not a standard one in the literature. 
We use it in this book to distinguish between Rz,4(t) and R’,,(t) as defined 


above. 


Because of the inclusion of the Heaviside function in its definition, one 
need not impose any restriction on the argument of the causal response 
function Rz,(t), which makes it easier to work with it in the frequency 
domain. However, though only the positive values of the argument of 
the response function R’/,, are relevant in describing the mean deviation 
in (8-130a), the function R’,,(t) itself is well defined for positive as well 
as negative values of the argument by means of (8-130b) or equivalent 
expressions. Indeed, it is no less useful than the causal response func- 
tion Rpa(t) in the context of the fluctuation-dissipation theorem and its 


consequences. 
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In our derivation above, h(t) plays the role of a time dependent driving force, 
coupled to the observable A of the system under consideration. In most 
situations of interest, it is the result of some external agency acting on the 
system. For instance, in the case of a magnetic system, it can be a magnetic 
field, in general time dependent. For a system distributed in space, one can 
more generally have an external field h(r,t) coupled to an observable that is 
itself in the nature of a field, having a space dependence. We will summarize 


this more general situation in sec. 8.4.4.2. 


With the causal response function introduced as above, one can replace 
the upper limit of integration in (8-129b) with oo and express the fluctuation- 


dissipation theorem in the form 


AB(t) = / * Cents a); (8-134) 


(oe) 


At the cost of repetition, I emphasize that in all these expressions for the
mean deviation (i.e., the difference between the non-equilibrium average
at time $t$ and the equilibrium average of an observable), given in terms of
the relevant response function, the response function itself is defined by
means of equilibrium averages alone and, moreover, the time dependence of
the observables appearing in these averages occurs only through the intrinsic
evolution. In other words, the linear approximation makes it possible, in
principle, to completely solve a non-equilibrium problem by referring to the
intrinsic Hamiltonian $H(z)$, since one needs to know only the evolution of
observables under $H(z)$ and evaluate ensemble averages over the equilibrium
ensemble pertaining to $H(z)$.

The causal response function $R_{BA}(t)$ can be interpreted as an impulse
response, since it gives the response $\overline{\Delta B}(t)$ to a delta-function
perturbation of unit strength acting at $\tau = 0$ ($h(\tau) = \delta(\tau)$; check this
out); in other words, $R_{BA}(t)$ is the functional derivative of $\overline{\Delta B}(t)$
with respect to $h(\tau)$ at $\tau = 0$. A similar statement can be made about
$R'_{BA}(t)$ provided that one keeps in mind the restriction on the time
argument ($t > 0$).
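
The impulse-response reading of (8-134) can be checked numerically. The sketch below is only an illustration: the exponential form of the causal response function, the grid, and all function names are assumptions made for the example, not quantities taken from the text.

```python
import math

def causal_response(t, gamma=1.0):
    """Hypothetical causal response R_BA(t) = Theta(t) * exp(-gamma * t)."""
    return math.exp(-gamma * t) if t >= 0 else 0.0

def mean_deviation(t, taus, h_values, dt):
    """Riemann-sum version of (8-134): sum over j of R_BA(t - tau_j) h(tau_j) dt."""
    return sum(causal_response(t - tau) * h for tau, h in zip(taus, h_values)) * dt

dt = 0.01
taus = [round(-5.0 + k * dt, 10) for k in range(1001)]  # grid from -5 to 5
# Discrete stand-in for delta(tau): a single cell of height 1/dt at tau = 0.
h_values = [1.0 / dt if abs(tau) < dt / 2 else 0.0 for tau in taus]

for t in (-1.0, 0.5, 2.0):
    # The driven response reproduces the response function itself.
    print(t, mean_deviation(t, taus, h_values, dt), causal_response(t))
```

For negative $t$ the sum vanishes identically, which is the discrete counterpart of the causal property built in through the Heaviside function.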


In the following (sec. 8.4.7) we will re-formulate the fluctuation-dissipation
theorem as a relation between frequency domain representations of
response functions and of appropriate time correlation functions. These
are referred to as dynamic susceptibilities and dynamic structure factors
respectively, and have important experimental implications. As we
will see, the causal property of the response function defined in (8-130b)
has important consequences in the frequency domain representation,
obtained from the time domain representation considered above by means
of Fourier transformation.


8.4.4.2 Linear response theory: coupling to external fields 


For a system distributed in space, observables are, in general, fields, i.e.,
functions of the position vector $\mathbf{r}$ (with some specified choice of the origin)
and may, in addition, carry a discrete index $\alpha$ ($= 1, 2, \ldots, K$, say). A
perturbation of the Hamiltonian $H$ is then of the form
$H' = H - \sum_\alpha \int d\mathbf{r}\, h_\alpha(\mathbf{r};t) A_\alpha(\mathbf{r})$,
where $A_\alpha(\mathbf{r})$ stands for a component of the field of an observable of the
system and $h_\alpha(\mathbf{r};t)$ is the corresponding component of an external time
dependent field interacting with it. The time dependent response to such
a perturbation, resulting in a mean deviation of an observable field, say,
$B_\gamma(\mathbf{r};t)$ ($\gamma = 1, 2, \ldots, M$), can be expressed as

$$\overline{\Delta B}_\gamma(\mathbf{r};t) = \sum_\alpha \int d\mathbf{r}' \int_{-\infty}^{\infty} d\tau\, R_{B_\gamma A_\alpha}(\mathbf{r},\mathbf{r}';t-\tau)\, h_\alpha(\mathbf{r}';\tau), \tag{8-135a}$$

where the response function $R_{B_\gamma A_\alpha}(\mathbf{r},\mathbf{r}';t)$ is given by

$$R_{B_\gamma A_\alpha}(\mathbf{r},\mathbf{r}';t) = -\beta\,\Theta(t)\,\frac{d}{dt}\langle A_\alpha(\mathbf{r}';0)\, B_\gamma(\mathbf{r};t)\rangle. \tag{8-135b}$$


8.4.5 Linear response theory: Non-Hamiltonian time-evolution


Up until now we have focused on the linear response of systems undergoing
Hamiltonian time-evolution, where the system is described by the
intrinsic Hamiltonian $H$ in the absence of the perturbation while the
perturbed system is also described by a Hamiltonian ($H' = H - Ah(t)$). More
generally, one is, at times, required to consider a perturbation under
which the time-evolution of the system is no longer of the Hamiltonian
type. To be concrete, we consider a perturbation giving rise to equations
of motion of the form ([140], chapter 13)

$$\dot{r}_i = \frac{\partial H}{\partial p_i} + C_i h(t), \qquad \dot{p}_i = -\frac{\partial H}{\partial r_i} + D_i h(t) \qquad (i = 1, 2, \ldots, N), \tag{8-136}$$

where $h(t)$ is the driving function responsible for the weak perturbation,
and $C_i, D_i$ ($i = 1, 2, \ldots, N$) are phase space functions to which the driving
function is coupled. In the special case of a perturbation under which the
system continues to be described by a Hamiltonian $H' = H - Ah(t)$, one
evidently has

$$C_i = -\frac{\partial A}{\partial p_i}, \qquad D_i = \frac{\partial A}{\partial r_i} \qquad (i = 1, 2, \ldots, N), \tag{8-137}$$

where $A$ stands for some (arbitrarily specified) phase space function.


However, the linear response theory can be developed along lines analogous
to what has been outlined in sec. 8.4.4.1 even when the functions
$C_i, D_i$ are more general than the special forms given in (8-137). In this
more general context one defines the dissipative flux as

$$j(z) = -\sum_{i=1}^{N} \left[ \frac{\partial H}{\partial p_i} D_i + \frac{\partial H}{\partial r_i} C_i \right], \tag{8-138}$$

where $z$ ($= (r_1, \ldots, r_N, p_1, \ldots, p_N)$) denotes the collective phase-space
coordinate. In the special case of Hamiltonian time-evolution defined by (8-137),
this reduces to

$$j(z) = \{H, A\} = -\dot{A}, \tag{8-139}$$

i.e., the rate of change of the observable $A(z)$ at any given time $t$, taken
with a negative sign.
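
The reduction of the dissipative flux to $-\dot{A}$ in the Hamiltonian case can be checked numerically for a single degree of freedom. The sketch below is an illustration under stated assumptions: the quartic Hamiltonian, the observable $A = r^2 p$, and the function names are hypothetical choices, and the sign convention used for (8-138) is the one under which (8-139) holds.

```python
def partial_derivatives(f, r, p, eps=1e-5):
    """Central-difference estimates of (df/dr, df/dp) at the phase point (r, p)."""
    df_dr = (f(r + eps, p) - f(r - eps, p)) / (2 * eps)
    df_dp = (f(r, p + eps) - f(r, p - eps)) / (2 * eps)
    return df_dr, df_dp

def hamiltonian(r, p):
    # Hypothetical one-dimensional intrinsic Hamiltonian (quartic oscillator).
    return 0.5 * p * p + 0.25 * r ** 4

def observable_a(r, p):
    # Hypothetical phase space function coupled to the driving function h(t).
    return r * r * p

def dissipative_flux(r, p):
    """j(z) of (8-138) for N = 1, with C and D taken from (8-137)."""
    dh_dr, dh_dp = partial_derivatives(hamiltonian, r, p)
    da_dr, da_dp = partial_derivatives(observable_a, r, p)
    c, d = -da_dp, da_dr
    return -(dh_dp * d + dh_dr * c)

def minus_a_dot_exact(r, p):
    # For the choices above, dA/dt = {A, H} = 2 r p^2 - r^5, worked out by hand.
    return -(2 * r * p * p - r ** 5)

r0, p0 = 1.2, 0.7
print(dissipative_flux(r0, p0), minus_a_dot_exact(r0, p0))
```

The two printed numbers agree to within the finite-difference error, as expected from (8-139).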


It turns out ([140]) that the mean deviation $\overline{\Delta B}(t)$ in an observable $B(z)$,
defined (refer to the first equality in (8-117a)) as the deviation of the
non-equilibrium expectation value of $B$ at time $t$ from its equilibrium
expectation value (pertaining to the intrinsic Hamiltonian $H$), is given by

$$\overline{\Delta B}(t) = -\beta \int_{-\infty}^{t} h(\tau)\, \langle j(0)\, B(t-\tau)\rangle\, d\tau, \tag{8-140}$$

which reduces to (8-129b) in the special case of (8-137) as, of course, it
should.


The rest of the theory leading up to the fluctuation-dissipation theorem
can now be developed, with response functions $R'_B(t)$, $R_B(t)$ (pertaining to
the observable $B$) given by

$$R_B(t) = \Theta(t)\, R'_B(t) = -\beta\, \Theta(t)\, \langle j(0)\, B(t)\rangle, \tag{8-141}$$

these being the generalizations of $R'_{BA}(t)$, $R_{BA}(t)$ introduced in sec. 8.4.4.1.
The Fourier transforms of these give the dynamic susceptibility and the
dissipation function in the frequency domain, as introduced in sec. 8.4.7
below, the frequency domain description being often useful from an
experimental point of view.


Non-Hamiltonian time-evolution, not necessarily restricted to the linear
regime, will be considered in sec. 9.9 in the context of characterizing
steady states far from equilibrium.


8.4.6 Fluctuation-dissipation theorem: quantum mechanical

In this and the following few sections I follow, in the main, [98]; Mazenko’s
is quite a remarkable book in the field, principally for its clarity and sweep.

In the quantum theoretic context one can follow a parallel line of
development, but now the observables and (mixed) states are represented by
operators that, generally speaking, do not commute among themselves.
Recalling the use of the hat symbol ($\hat{\ }$) to denote an operator, we consider a
system with the Hamiltonian $\hat{H}'$ made up of an intrinsic part $\hat{H}$ and a
perturbation $-\hat{A}h(t)$,

$$\hat{H}' = \hat{H} - \hat{A}h(t), \tag{8-142}$$

where $\hat{A}$ is an observable coupled to a weak time-dependent external
agency that we assume to be represented by a scalar factor $h(t)$ (more
generally, one can consider a scalar field $h(\mathbf{r};t)$ as in sec. 8.4.4.2). We
consider the response of the system to the perturbation as in the classical
case, but now taking into account the non-commutativity of the quantum
mechanical operators. For this, it is convenient to work in the interaction
picture briefly outlined in sec. 8.4.2.2, with reference to which one has
to make the replacements $\hat{H} \to \hat{H}'$, $\hat{H}_0 \to \hat{H}$, $\hat{V} \to -\hat{A}h(t)$. The interaction
picture representations of $\hat{A}$, $\hat{B}$ (operators in the Schrödinger picture) are
then (refer to sec. 8.4.2.2)

$$\hat{A}^{(I)}(t) = \hat{U}_0^\dagger(t,t_0)\, \hat{A}\, \hat{U}_0(t,t_0), \qquad \hat{B}^{(I)}(t) = \hat{U}_0^\dagger(t,t_0)\, \hat{B}\, \hat{U}_0(t,t_0), \tag{8-143a}$$

where

$$\hat{U}_0(t,t_0) = e^{-\frac{i}{\hbar}\hat{H}(t-t_0)} \tag{8-143b}$$

(we assume that the intrinsic Hamiltonian $\hat{H}$ is time-independent), while
the equilibrium density matrix $\hat{\rho}_{eq}$ (pertaining to the intrinsic Hamiltonian
$\hat{H}$) evolves as

$$\hat{\rho}^{(I)}(t) = \hat{W}(t,t_0)\, \hat{\rho}_{eq}\, \hat{W}^\dagger(t,t_0), \tag{8-143c}$$

with $\hat{W}$ defined as in (8-107).

Since the Schrödinger picture and the interaction picture are equivalent
to each other, the non-equilibrium average of the observable $\hat{B}$ at any
given time $t$ (recall that we have assumed all the three pictures to be
identical at the initial time $t = t_0$) is given by

$$\overline{B}(t) = \mathrm{Tr}\big(\hat{\rho}^{(I)}(t)\, \hat{B}^{(I)}(t)\big) = \mathrm{Tr}\big(\hat{W}(t,t_0)\, \hat{\rho}_{eq}\, \hat{W}^\dagger(t,t_0)\, \hat{B}^{(I)}(t)\big). \tag{8-144a}$$


One can now insert in this expression the solution to the integral
equation (8-109). A convenient way to do this is to expand the solution in
powers of the coupling $h(t)$ (here the symbol $h$ is not to be confused with
the Planck constant; the latter occurs in the present chapter in the form
$\hbar = \frac{h}{2\pi}$) by an iterative procedure and to truncate the resulting series at
some appropriate stage. Noting that the leading (and trivial) approximation
is $\hat{W}(t) = \hat{I}$, one obtains the next-to-leading solution by substituting
$\hat{W} = \hat{I}$ in the second term of (8-109) so as to end up with the linear
approximation (with $\hat{V}(t) = -\hat{A}^{(I)}(t)h(t)$)

$$\hat{W}(t) \approx \hat{I} + \frac{i}{\hbar}\int_{t_0}^{t} d\tau\, h(\tau)\, \hat{A}^{(I)}(\tau). \tag{8-144b}$$

Inserting in (8-144a), one obtains the expression for $\overline{B}(t)$ in the linear
approximation, which can be supposed to work for sufficiently weak
perturbations, where $|h(t)|$ remains small at all times.


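With $\hat{W}(t)$ given by (8-144b), one obtains

$$\hat{W}(t)\, \hat{\rho}_{eq}\, \hat{W}^\dagger(t)\, \hat{B}^{(I)}(t) \approx \hat{\rho}_{eq}\hat{B}^{(I)}(t) - \frac{i}{\hbar}\int_{t_0}^{t} d\tau\, h(\tau)\, \hat{\rho}_{eq}\, [\hat{A}^{(I)}(\tau), \hat{B}^{(I)}(t)] \tag{8-145}$$

(check this out; the two sides are to be compared under the trace
in (8-144a), where the cyclic property can be used). One now has to insert
this expression into (8-144a), for which we note that (refer to (8-143a),
(8-143b))

$$\mathrm{Tr}\big(\hat{\rho}_{eq}\hat{B}^{(I)}(t)\big) = \mathrm{Tr}\big(\hat{\rho}_{eq}\, e^{\frac{i}{\hbar}\hat{H}(t-t_0)}\, \hat{B}\, e^{-\frac{i}{\hbar}\hat{H}(t-t_0)}\big) = \mathrm{Tr}\big(\hat{\rho}_{eq}\hat{B}\big) = \langle \hat{B}\rangle, \tag{8-146}$$

where we have made use of the fact that $\hat{\rho}_{eq}$, being the equilibrium
density matrix pertaining to the intrinsic Hamiltonian $\hat{H}$, commutes with
$e^{\frac{i}{\hbar}\hat{H}(t-t_0)}$, and also of the cyclic property of the trace operation (check
this out). Recall that the symbol $\langle\cdot\rangle$ stands for the equilibrium ensemble
average for the unperturbed system. Further,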


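$$\mathrm{Tr}\big(\hat{\rho}_{eq}\hat{A}^{(I)}(\tau)\hat{B}^{(I)}(t)\big) = \mathrm{Tr}\big(\hat{\rho}_{eq}\, e^{\frac{i}{\hbar}\hat{H}(\tau-t_0)}\hat{A}\, e^{-\frac{i}{\hbar}\hat{H}(\tau-t_0)}\, e^{\frac{i}{\hbar}\hat{H}(t-t_0)}\hat{B}\, e^{-\frac{i}{\hbar}\hat{H}(t-t_0)}\big)$$

$$= \mathrm{Tr}\big(\hat{\rho}_{eq}\hat{A}\, e^{\frac{i}{\hbar}\hat{H}(t-\tau)}\hat{B}\, e^{-\frac{i}{\hbar}\hat{H}(t-\tau)}\big) = \langle \hat{A}\hat{B}(t-\tau)\rangle, \tag{8-147}$$

where we have once again made use of the cyclic property of the trace
operation and of the fact that $\hat{\rho}_{eq}$ commutes with $e^{\frac{i}{\hbar}\hat{H}(\tau-t_0)}$. In the final
expression above, $\hat{B}(t-\tau)$ stands for the observable $B$ at time $t-\tau$ obtained
from the Schrödinger picture operator $\hat{B}$ by time evolution through an
interval $t-\tau$ by means of the intrinsic Hamiltonian $\hat{H}$. In a similar manner,
one obtains

$$\mathrm{Tr}\big(\hat{\rho}_{eq}\hat{B}^{(I)}(t)\hat{A}^{(I)}(\tau)\big) = \langle \hat{B}(t-\tau)\hat{A}\rangle. \tag{8-147b}$$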


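One then finally gets, from (8-145), the expression for the mean deviation
as

$$\overline{\Delta B}(t) = \overline{B}(t) - \langle \hat{B}\rangle = \frac{i}{\hbar}\int_{t_0}^{t} d\tau\, h(\tau)\, \langle [\hat{B}(t-\tau), \hat{A}]\rangle. \tag{8-148a}$$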


8.4.6.1 The response function: quantum mechanical 


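Analogous to the classical case, eq. (8-148a) is again of the form

$$\overline{\Delta B}(t) = \int_{-\infty}^{t} d\tau\, R'_{BA}(t-\tau)\, h(\tau), \tag{8-148b}$$

where the quantum mechanical response function $R'_{BA}(t)$, relevant here only
for $t > 0$, is given by

$$\text{[quantum mechanical:]} \quad R'_{BA}(t) = \frac{i}{\hbar}\langle [\hat{B}(t), \hat{A}]\rangle. \tag{8-149a}$$

The causal response function $R_{BA}(t)$, defined for arbitrarily chosen values
of $t$, is then of the form

$$\text{[quantum mechanical:]} \quad R_{BA}(t) = -\frac{i}{\hbar}\,\Theta(t)\, \langle [\hat{A}, \hat{B}(t)]\rangle, \tag{8-149b}$$

in terms of which the mean deviation for $B$ at time $t$ is of the form

$$\overline{\Delta B}(t) = \int_{-\infty}^{\infty} d\tau\, R_{BA}(t-\tau)\, h(\tau). \tag{8-150}$$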

The formula (8-148a) or, equivalently, (8-149b), embodies the statement
of the quantum version of the fluctuation-dissipation theorem. However,
as mentioned earlier, a more useful form of the theorem is the one relating
the frequency domain representations of the mean deviation $\overline{\Delta B}(t)$ and the
time correlation function involving the operators $\hat{A}$, $\hat{B}$. This we will look
into in sec. 8.4.7.2 below.


1. Eq. (8-149a) yields an alternative expression for the classical response
function, obtained by going over to the limit $\hbar \to 0$, wherein $\frac{1}{i\hbar}$ times
the quantum mechanical commutator of two operators goes over to the
Poisson bracket of the corresponding phase space functions. In other
words, one has (see [82])

$$\text{[classical:]} \quad R'_{BA}(t) = \langle \{A, B(t)\}\rangle, \tag{8-151}$$

which constitutes an equivalent expression to formulae (8-130b), (8-132).
The classical causal response function is obtained from this by
multiplying with the step function $\Theta(t)$.

2. The quantum mechanical expression (8-149b) can also be written in
the form

$$\text{[quantum mechanical:]} \quad R_{BA}(t) = -\frac{i}{\hbar}\,\Theta(t)\, \langle [\delta\hat{A}, \delta\hat{B}(t)]\rangle, \tag{8-152}$$

where, for any observable $\hat{C}$, $\delta\hat{C} = \hat{C} - \langle \hat{C}\rangle$ (check the above statement
out).
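
The check suggested in item 2 takes one line, and is sketched below. The only inputs are that the equilibrium averages $\langle \hat{A}\rangle$, $\langle \hat{B}\rangle$ are ordinary numbers (multiples of the unit operator $\hat{I}$) and that $\langle \hat{B}(t)\rangle = \langle \hat{B}\rangle$ in the equilibrium ensemble; the notation $\hat{I}$ is introduced here for the illustration.

```latex
\begin{align*}
[\delta\hat{A}, \delta\hat{B}(t)]
  &= [\hat{A} - \langle\hat{A}\rangle \hat{I},\; \hat{B}(t) - \langle\hat{B}\rangle \hat{I}] \\
  &= [\hat{A}, \hat{B}(t)]
     - \langle\hat{B}\rangle\,[\hat{A}, \hat{I}]
     - \langle\hat{A}\rangle\,[\hat{I}, \hat{B}(t)]
     + \langle\hat{A}\rangle\langle\hat{B}\rangle\,[\hat{I}, \hat{I}] \\
  &= [\hat{A}, \hat{B}(t)],
\end{align*}
% every commutator involving the unit operator vanishes, so (8-152) and (8-149b) coincide.
```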


As in the classical case, the response function $R'_{BA}(t)$ is well defined for all
values of its argument, though only the positive values of the latter are
relevant in determining the mean deviation in (8-148b). As we will see
below, the fluctuation-dissipation theorem can be conveniently stated in
the frequency domain by means of the Fourier transform of $R'_{BA}(t)$ (or
in terms of a related function $\chi''_{BA}(\omega)$, see below), as in (8-161b). The
relation between $R'_{BA}$ and the causal response function $R_{BA}$ in the frequency
domain will be indicated below (sec. 8.4.8).

Incidentally, the generalization to non-Hamiltonian time-evolution (sec. 8.4.5
above) does not have a systematically developed quantum theoretic analog.
It is likely, though, that a generalization in the quantum mechanical
context can be formulated in terms of the Kubo correlation function, defined
below in sec. 8.4.7.4. We will, however, not pursue this possibility in this
book.


8.4.7 Response and correlation functions: the frequency domain

8.4.7.1 The frequency domain representation: Introduction


Looked at from the experimental point of view, the response functions
and the correlation functions are best worked with in the frequency
domain representations, obtained by Fourier transformation from the time
domain representations considered in the above sections.


Notice, first of all, that the formula for the mean deviation, as expressed
in (8-134) or (8-150), is in the form of a convolution between the causal
response function and the external perturbation factor $h(t)$, which implies
the following simple relation in the frequency domain,

$$\overline{\Delta B}(\omega) = \chi_{BA}(\omega)\, h(\omega), \tag{8-153}$$

where $\overline{\Delta B}(\omega)$, $\chi_{BA}(\omega)$, and $h(\omega)$ are Fourier transforms of $\overline{\Delta B}(t)$, $R_{BA}(t)$,
and $h(t)$ respectively.

The Fourier transform of a function $f(t)$, specified over the interval from $-\infty$
to $\infty$, is defined as

$$\tilde{f}(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt, \tag{8-154a}$$

with the inverse transformation given by

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{f}(\omega)\, e^{-i\omega t}\, d\omega. \tag{8-154b}$$

The default notation for the Fourier transform of a function $f(t)$ is
$\tilde{f}(\omega)$, where a tilde is inserted over the symbol used to denote the function,
while a separate symbol for the Fourier transform is also used with explicit
mention, as in the case of $R_{BA}$ above. Often, for the sake of easy reference,
the Fourier transform of $f(t)$ is denoted by $f(\omega)$, without the tilde overhead,
by retaining the same symbol ($f$ in the present example) and just changing
the argument ($t$ to $\omega$), where the intended meaning is to be read from the
context.
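
The sign convention in (8-154a) matters when a causal function is transformed. As a rough numerical illustration, the sketch below transforms a hypothetical causal response $\Theta(t)e^{-\gamma t}$ with the $e^{+i\omega t}$ kernel and compares the result with the closed form $1/(\gamma - i\omega)$ of the same integral; the model function, the decay rate, and the function names are assumptions made for the example.

```python
import cmath
import math

def fourier_transform(f, omega, t_min, t_max, n_steps):
    """Trapezoidal estimate of the integral of f(t) exp(+i omega t) dt, as in (8-154a)."""
    dt = (t_max - t_min) / n_steps
    total = 0.0 + 0.0j
    for k in range(n_steps + 1):
        t = t_min + k * dt
        weight = 0.5 if k in (0, n_steps) else 1.0  # trapezoid end-point weights
        total += weight * f(t) * cmath.exp(1j * omega * t)
    return total * dt

GAMMA = 1.5  # hypothetical decay rate

def causal_response(t):
    # Hypothetical causal response: Theta(t) * exp(-GAMMA * t).
    return math.exp(-GAMMA * t) if t >= 0 else 0.0

omega = 2.0
# The function vanishes for t < 0, so the integration can start at t = 0.
numeric = fourier_transform(causal_response, omega, 0.0, 40.0, 40000)
exact = 1.0 / (GAMMA - 1j * omega)  # closed form of the same integral
print(numeric, exact)
```

With this convention the only singularity of the transform, at $\omega = -i\gamma$, lies in the lower half of the complex $\omega$-plane, which is how the causal property shows up in the frequency domain for this model.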


While the formula (8-153) is written for the classical mean deviation, the
notation will be slightly different in the quantum mechanical context so
as to remind us that the observables and (mixed) states are represented
by operators, denoted with hat symbols overhead. Thus, for instance,
$\overline{\Delta B}(\omega)$ will have to be replaced with $\overline{\Delta \hat{B}}(\omega)$. However, in the interest of
simplicity, the notation can be modified, with explicit mention if
necessary.


The fluctuation-dissipation theorem relates the response function to a
time correlation function. For instance, eq. (8-133a), read with the second
line in eq. (8-130b), relates the classical causal response function with
the derivative of the time correlation function $\langle A(0)B(t)\rangle$. Generally
speaking, a time correlation function is essentially the equilibrium expectation
value of a product of two phase space functions, evaluated at distinct
time instants, where the equilibrium expectation value pertains to the
intrinsic Hamiltonian $H$ and the time dependence of an observable also
refers to the Heisenberg type description of the time evolution by means
of the intrinsic Hamiltonian. In the quantum mechanical context, a time
correlation function admits of alternative definitions because of the
non-commutativity of the observables. For instance, $\langle \hat{A}(0)\hat{B}(t)\rangle$ stands for an
unsymmetrized time correlation function, while the symmetrized version
$\frac{1}{2}\langle \hat{A}(0)\hat{B}(t) + \hat{B}(t)\hat{A}(0)\rangle$ is more commonly referred to. On the other hand,
the anti-symmetric correlation function, i.e., the averaged commutator
$\langle [\hat{A}(0), \hat{B}(t)]\rangle$, is directly related to the quantum mechanical response
function $R'_{BA}(t)$. As we will see below, there exist alternative formulae
relating $\chi''_{BA}(\omega)$, defined quantum mechanically, to the Fourier transforms of
these various quantum mechanical time correlation functions.


In the following, we will first look at the quantum mechanical response
function and the quantum mechanical correlation functions in the
frequency domain, which lead to useful frequency domain formulations of
the quantum mechanical fluctuation-dissipation theorem. A number of
expressions falling under the category of quantum correlation functions
all reduce to the classical correlation function in the limit $\hbar \to 0$, leading
to the classical fluctuation-dissipation theorem in the frequency domain.
In the process, we will also have a look at the classical and the quantum
mechanical relaxation functions in the frequency domain.

The function $\chi_{BA}(\omega)$ is termed the dynamic susceptibility with reference
to the observables $A$ and $B$, and is defined for real values of the
argument $\omega$. The correlation functions, on the other hand, are related to
the dynamic structure factors obtained from scattering experiments. The
fluctuation-dissipation theorem establishes a relation between these two
sets of quantities.


8.4.7.2 Dynamic susceptibility and the dynamic structure factor 


In the present section, we first set up a relation between the function
$\chi''_{BA}(\omega)$, defined as $\frac{1}{2i}$ times the Fourier transform of $R'_{BA}(t)$,

$$\chi''_{BA}(\omega) = \frac{1}{2i}\int_{-\infty}^{\infty} dt\, e^{i\omega t}\, R'_{BA}(t), \tag{8-155}$$

with $R'_{BA}(t)$ given by (8-149a), and the Fourier transform ($\bar{S}_{BA}(\omega)$) of the
unsymmetrized correlation function

$$\bar{S}_{BA}(t) = \langle \hat{B}(t)\hat{A}\rangle, \tag{8-156a}$$

$$\bar{S}_{BA}(\omega) = \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, \bar{S}_{BA}(t). \tag{8-156b}$$

1. The function

$$\chi''_{BA}(t) = \frac{1}{2i} R'_{BA}(t) \tag{8-157}$$

will occur frequently in our present context since its Fourier transform
$\chi''_{BA}(\omega)$ will be seen to signify the imaginary part of $\chi_{BA}(\omega)$, the Fourier
transform of the causal response function $R_{BA}(t)$ (see sec. 8.4.8). It
determines the rate of energy dissipation in the system in the course
of non-equilibrium evolution (sec. 8.4.9).

2. As mentioned above, we follow here the convention of denoting the
Fourier transform of a function (of time $t$ in the present context) by
retaining the same symbol as for the function itself and just replacing
the argument ($t$ with $\omega$).

3. It is worthwhile to recall the roles of the observables $A$, $B$ of the
system under consideration, so as to appreciate the significance of the
sub-index $BA$ in the above formulae: the second symbol in the
sub-index refers to $A$, the observable that couples to the external driving
force ($h(t)$) that perturbs the system away from the equilibrium state
pertaining to the intrinsic Hamiltonian $H$; the first symbol, on the other
hand, refers to the observable on which the effect of the perturbation
is expressed by means of the mean deviation $\overline{\Delta B}(t)$. This introduces a
slight discrepancy between the formulae written below and the
corresponding formulae in [98], chapter 3, where you will find the symbols
$A$ and $B$ interchanged. The content of the present section is based on
Mazenko’s book.
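
Recalling that the angular brackets ($\langle\cdot\rangle$) are used to denote the ensemble
average pertaining to the intrinsic Hamiltonian $\hat{H}$ at temperature
$T = (k_B\beta)^{-1}$, we write

$$\bar{S}_{BA}(t) = \frac{1}{Z_{eq}}\mathrm{Tr}\big(e^{-\beta\hat{H}}\hat{B}(t)\hat{A}\big), \tag{8-158a}$$

where $Z_{eq} = \mathrm{Tr}\, e^{-\beta\hat{H}}$. We rewrite this as

$$\bar{S}_{BA}(t) = \frac{1}{Z_{eq}}\mathrm{Tr}\big(e^{-\beta\hat{H}}\, e^{\frac{i}{\hbar}\hat{H}t}\, \hat{B}\, e^{-\frac{i}{\hbar}\hat{H}t}\, e^{\beta\hat{H}}\, e^{-\beta\hat{H}}\hat{A}\big). \tag{8-158b}$$

We now interpret the operator $e^{-\beta\hat{H}}e^{\frac{i}{\hbar}\hat{H}t} = e^{\frac{i}{\hbar}\hat{H}(t+i\beta\hbar)}$ as the adjoint of the
evolution operator through a complex time $t + i\beta\hbar$, and then express the
above formula as

$$\bar{S}_{BA}(t) = \frac{1}{Z_{eq}}\mathrm{Tr}\big(\hat{B}(t+i\beta\hbar)\, e^{-\beta\hat{H}}\hat{A}\big). \tag{8-158c}$$

Making use of the cyclic property of the trace operation, one finally
obtains

$$\bar{S}_{BA}(t) = \frac{1}{Z_{eq}}\mathrm{Tr}\big(e^{-\beta\hat{H}}\hat{A}\hat{B}(t+i\beta\hbar)\big) = \bar{S}_{AB}(-t-i\beta\hbar) \tag{8-159}$$

(reason this out). On taking the Fourier transform we end up with the
following symmetry property of the correlation function in the frequency
domain

$$\bar{S}_{BA}(\omega) = \bar{S}_{AB}(-\omega)\, e^{\beta\hbar\omega} \tag{8-160}$$

(check this out).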
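
The step from (8-159) to (8-160) can be sketched as follows, with the Fourier convention (8-154a). The shift of the integration variable into the complex plane is assumed here to be permissible, i.e., the correlation function is taken to be analytic in the strip involved and to decay at large $|t|$; $t'$ is a dummy variable introduced for the illustration.

```latex
\begin{align*}
\bar{S}_{BA}(\omega)
  &= \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, \bar{S}_{AB}(-t - i\beta\hbar)
     && \text{by (8-159)} \\
  &= \int_{-\infty}^{\infty} dt'\, e^{i\omega(-t' - i\beta\hbar)}\, \bar{S}_{AB}(t')
     && \text{with } t' = -t - i\beta\hbar \\
  &= e^{\beta\hbar\omega} \int_{-\infty}^{\infty} dt'\, e^{i(-\omega)t'}\, \bar{S}_{AB}(t')
   = e^{\beta\hbar\omega}\, \bar{S}_{AB}(-\omega).
\end{align*}
```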

We now refer back to formula (8-155) and consider the function $\chi''_{BA}$ in
the time domain written as

$$\chi''_{BA}(t) = \frac{1}{2i}R'_{BA}(t) = \frac{1}{2\hbar}\langle [\hat{B}(t), \hat{A}]\rangle = \frac{1}{2\hbar}\big(\bar{S}_{BA}(t) - \bar{S}_{AB}(-t)\big), \tag{8-161a}$$

where we have made use of the invariance of the equilibrium distribution
under time translation. Taking the Fourier transform and making use of
the symmetry property (8-160) we finally arrive at the following formula
relating the response function $\chi''_{BA}(\omega)$ to the (unsymmetrized) correlation
function ($\bar{S}_{BA}(\omega)$) in the frequency domain, which we set out to derive:

$$\chi''_{BA}(\omega) = \frac{1}{2\hbar}\big(1 - e^{-\beta\hbar\omega}\big)\bar{S}_{BA}(\omega) \tag{8-161b}$$

(check this out).

This constitutes a useful form of the fluctuation-dissipation theorem,
though other equivalent formulations in the frequency domain are also
common where dynamic susceptibilities (i.e., Fourier transforms of
response functions) are related with correlation functions analogous to
$\bar{S}_{BA}(\omega)$. A few of these relations will be indicated below.


For instance, one can make use of (8-160) to express $\chi''_{BA}(\omega)$ in terms
of $S_{BA}(\omega)$, the Fourier transform of the symmetrized correlation function,

$$S_{BA}(t) = \frac{1}{2}\langle \hat{A}\hat{B}(t) + \hat{B}(t)\hat{A}\rangle, \tag{8-162a}$$

which gives

$$S_{BA}(\omega) = \frac{1}{2}\langle \hat{A}\hat{B}(\omega) + \hat{B}(\omega)\hat{A}\rangle. \tag{8-162b}$$

It is straightforward to see that the required relation, constituting a
commonly stated form of the (quantum mechanical) fluctuation-dissipation
theorem, is

$$\text{[quantum mechanical:]} \quad \chi''_{BA}(\omega) = \frac{1}{\hbar}\tanh\Big(\frac{\beta\hbar\omega}{2}\Big) S_{BA}(\omega) \tag{8-163}$$

(check this out).
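
The equivalence of the two quantum forms of the theorem, and their common classical limit, can be checked with a few lines of arithmetic. In the sketch below the function names and the numerical values of $\beta$, $\hbar$, $\omega$ and of the unsymmetrized correlation function are arbitrary assumptions; the only physics used is (8-160), which gives the symmetrized function as $\frac{1}{2}(1 + e^{-\beta\hbar\omega})$ times the unsymmetrized one.

```python
import math

def chi_from_unsymmetrized(s_bar, omega, beta, hbar):
    """(8-161b): chi'' = (1 - exp(-beta hbar omega)) * S_bar / (2 hbar)."""
    return (1.0 - math.exp(-beta * hbar * omega)) * s_bar / (2.0 * hbar)

def symmetrized_from_unsymmetrized(s_bar, omega, beta, hbar):
    """S = (S_bar_BA(omega) + S_bar_AB(-omega)) / 2, with (8-160) used for the second term."""
    return 0.5 * (1.0 + math.exp(-beta * hbar * omega)) * s_bar

def chi_from_symmetrized(s_sym, omega, beta, hbar):
    """(8-163): chi'' = tanh(beta hbar omega / 2) * S / hbar."""
    return math.tanh(0.5 * beta * hbar * omega) * s_sym / hbar

def chi_classical(s, omega, beta):
    """Classical limit: chi'' = beta * omega * S / 2."""
    return 0.5 * beta * omega * s

beta, hbar, omega, s_bar = 0.8, 1.0, 1.3, 2.0  # arbitrary illustrative values
s_sym = symmetrized_from_unsymmetrized(s_bar, omega, beta, hbar)
print(chi_from_unsymmetrized(s_bar, omega, beta, hbar),
      chi_from_symmetrized(s_sym, omega, beta, hbar))
```

Repeating the evaluation with a small value of `hbar` brings both quantum expressions close to `chi_classical`, in line with the limit $\hbar \to 0$.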


In the classical limit ($\hbar \to 0$) the problem with non-commutativity
disappears and one is left with a single correlation function,

$$\text{[classical:]} \quad \bar{S}_{BA}(t) = S_{BA}(t) = \langle B(t)A\rangle, \tag{8-164a}$$

which gives the following form of the classical fluctuation-dissipation
theorem in the frequency domain

$$\text{[classical:]} \quad \chi''_{BA}(\omega) = \frac{\beta\omega}{2} S_{BA}(\omega). \tag{8-164b}$$

This result could also be obtained straightaway by Fourier transformation
from the second line of (8-130b), using the definition of $\chi''_{BA}(t)$
($= \frac{1}{2i}R'_{BA}(t)$) and the classical correlation function (8-164a) (check this
out).


Recall that the correlation $\bar{S}_{BA}(t)$ denotes the way a fluctuation in $B$ ($\hat{B}$
in quantum theory) at time $t$ is correlated with that in $A$ at time 0, while
$\chi''_{BA}(t)$ is proportional to the deviation in $B$ at time $t$ caused by a
fluctuation in $A$ at time 0. Their relation in the frequency domain expresses
the fact that these are effectively the same things in the special case that the
variations occur in a simple harmonic manner, which implies a
corresponding relation for arbitrary time variation as well. In other words,
the fluctuation-dissipation theorem signifies that the temporal variation
of a small deviation in a system parameter remains the same whether it
is due to a perturbation imposed externally or one caused by intrinsic
fluctuations in the system: in the linear regime there is no way to
distinguish between the two. In this sense, the fluctuation-dissipation theorem
constitutes an elaboration of Onsager’s regression hypothesis.


Correlation functions of a system in the frequency domain are inferred 
from scattering experiments. In a typical such experiment, the system is 
probed by means of a beam of particles of well-defined energy and mo- 
mentum (or by what can be interpreted as a plane wave of well-defined 
frequency and wave vector) and the distribution of scattered particles in 
various directions is observed. Assuming that the probe does not mate- 
rially alter the spontaneous equilibrium fluctuations within the system, 
the distribution in the scattered field tells how the probe gets affected 
by these equilibrium fluctuations. The correlations of fluctuations within
the system inferred from such experiments are generically referred to
as dynamic structure factors. On the other hand, response functions in
the frequency domain can be inferred from spectroscopic data in which a
similar beam is used to weakly perturb a system and the response of the
system is inferred from the resulting absorption and emission spectrum.
These response data in the frequency domain are generically referred to
as dynamic susceptibilities. The fluctuation-dissipation theorem
(formulae (8-163), (8-164b)) provides instances of the relations between the
dynamic structure factors ($S_{BA}(\omega)$) and dynamic susceptibilities ($\chi''_{BA}(\omega)$,
related to $\chi_{BA}(\omega)$, see below, sec. 8.4.8) that have been substantiated by
scattering and spectroscopic data.


I repeat that the fluctuation-dissipation theorem can be expressed in one
of several forms, relating a response function ($R_{BA}(t)$ or $R'_{BA}(t)$) and a
correlation function ($\bar{S}_{BA}(t)$ or $S_{BA}(t)$) in the time domain, or their Fourier
transforms (a dynamic susceptibility and a dynamic structure factor,
using these terms in a generic sense) in the frequency domain. It essentially
signifies the fact that, in the linear regime, a response generated by an
external driving force is indistinguishable from intrinsic equilibrium
fluctuations in the system.


In sec. 8.4.3, we discussed the process of relaxation from a constrained
equilibrium and saw that it is essentially similar to the regression of
intrinsic fluctuations. In this sense the fluctuation-dissipation theorem can
be looked upon as a generalization of the regression hypothesis (or the
regression theorem, as it is often referred to). In addition to the correlation
functions $\bar{S}$ and $S$, the quantum mechanical description of the process of
relaxation from a constrained equilibrium brings in another correlation
function, the so-called Kubo correlation function. This we briefly review in
sec. 8.4.7.3 below.

Incidentally, experimental set-ups often tell us about the autocorrelation
relating to an observable $A$, i.e., a correlation function of the form $\bar{S}_{AA}$
or $S_{AA}$, one for which the observable $\hat{B}$ (or the phase function $B(z)$ in
the classical context) is the same as $A$, the observable ‘conjugate’ to the
driving force. The fluctuation-dissipation theorem then relates such an
autocorrelation function with a corresponding response function of the
form $R'_{AA}(t)$ or $R_{AA}(t)$.


8.4.7.3 Relaxation from a constrained equilibrium 


Relaxation: the general formula. 


The process of relaxation from a constrained equilibrium was considered
in the classical context in sec. 8.4.3 where, however, the process
appeared to belong to a category distinct from the one of driven
non-equilibrium processes considered in sections 8.4.4, 8.4.6. In the case of
a driven process we assumed that the system is described by an
equilibrium ensemble $\rho_{eq}$ pertaining to the intrinsic Hamiltonian $H$ ($\hat{H}$ in the
quantum context) at the initial time $-\infty$, after which it is acted on by the
weak time-dependent driving force $h(t)$.

In order that the relaxation process can be described in this framework,
one can conveniently assume that the perturbation $-Ah$ is turned on
adiabatically, i.e., at an infinitely slow rate starting from $t_0 \to -\infty$ when
the state of the system is $\rho_{eq}$, in such a way that the Hamiltonian gets
changed from $H$ ($h(t) \to 0$ at $t \to -\infty$) to $H - Ah$ at $t = 0$, after which the
perturbation is assumed to be switched off. This process of driving the
system corresponds to the following time dependence of the driving force
$h(t)$ (see fig. 8-5):

$$h(t) = h\, e^{\eta t}\, \Theta(-t) \tag{8-165}$$

(check this out), where $\eta$ is a slowness parameter that is assumed to be
positive and infinitesimally small ($\eta \to 0^+$). This means that, for $t > 0$,
the system relaxes from the constrained equilibrium pertaining to the
Hamiltonian $H - Ah$ and, at the same time, the process can now be treated
within the framework of the theory of non-equilibrium evolution under
driving.


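In this more general setting, the mean deviation $\overline{\Delta B}(t)$ at $t > 0$ is obtained
as (we refer, for the time being, to the classical context, where
equations (8-130a), (8-130b) apply; the quantum formula can be obtained by
making use of the Kubo correlation function introduced below; we
assume $\langle B\rangle = 0$, $\langle A\rangle = 0$, for the sake of simplicity)

$$\overline{\Delta B}(t) = -h\beta\int_{-\infty}^{t} d\tau\, \frac{d}{dt}\langle B(t)A(\tau)\rangle\, e^{\eta\tau}\, \Theta(-\tau) = -h\beta\int_{-\infty}^{0} d\tau\, \frac{d}{dt}\langle B(t)A(\tau)\rangle\, e^{\eta\tau}. \tag{8-166a}$$

One can now replace the derivative $\frac{d}{dt}$ with $-\frac{d}{d\tau}$ (recall that $\langle B(t)A(\tau)\rangle$
depends on $t, \tau$ only through $t - \tau$) and then integrate by parts to obtain (in
the limit $\eta \to 0^+$)

$$\overline{\Delta B}(t) = h\beta\langle B(t)A(0)\rangle \qquad (t > 0), \tag{8-166b}$$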

which, as expected, is nothing but (8-117b), written under the assumption
that $\langle A\rangle = \langle B\rangle = 0$.


Figure 8-5: Depicting the variation of the driving force $h(t)$ with time for the
process of relaxation from a constrained equilibrium; the driving force rises
infinitely slowly from zero at $t \to -\infty$ to some specified value $h$ at $t = 0$, whereafter
the perturbation is switched off, so that $h(t) = 0$ for $t > 0$; the system relaxes from
an initial constrained equilibrium (one pertaining to the Hamiltonian $H - Ah$) at
$t = 0$ back to the equilibrium state pertaining to the intrinsic Hamiltonian $H$,
the state it started from at $t \to -\infty$; with such a time variation, the relaxation
process falls under the general category of driven non-equilibrium evolution
considered in sections 8.4.4, 8.4.6.


In order to work in the frequency domain, we start by noting that the
mean deviation at any time $t > 0$ is given, in terms of $\chi''_{BA}(t)$, by (refer
to (8-130a), and the first equality in (8-161a))

$$\overline{\Delta B}(t) = 2ih\int_{-\infty}^{0} d\tau\, \Theta(t-\tau)\, \chi''_{BA}(t-\tau)\, e^{\eta\tau} \tag{8-167}$$

(check this out; recall that $\chi''_{BA}(t)$ depends only on the intrinsic
Hamiltonian $H$ and on the observables $A$, $B$, and stands for a general feature
of non-equilibrium evolution in the linear regime, having no reference to
processes of any particular type). Since we are interested in $t > 0$ when
the system relaxes from the constrained equilibrium back to its intrinsic
equilibrium, the step function within the integral is 1, and we get, by
Fourier-transforming $\chi''_{BA}(t-\tau)$,

$$\overline{\Delta B}(t) = \frac{ih}{\pi}\int_{-\infty}^{\infty} d\omega \int_{-\infty}^{0} d\tau\, \chi''_{BA}(\omega)\, e^{-i\omega(t-\tau)}\, e^{\eta\tau}. \tag{8-168}$$

The integration over $\tau$ can now be performed, recalling that, by definition,
$\lim_{\tau\to-\infty} e^{\eta\tau} = 0$, thereby arriving at the result

$$\overline{\Delta B}(t) = \frac{h}{\pi}\int_{-\infty}^{\infty} d\omega\, \frac{\chi''_{BA}(\omega)}{\omega - i\eta}\, e^{-i\omega t} \tag{8-169}$$

(check this out; in this formula, $h$ stands for the strength of the driving
force at $t = 0$, when the constrained equilibrium pertains to the
Hamiltonian $H - Ah$).
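
As an illustration of how the frequency domain result ties up with the time domain formula (8-166b), one can work through a simple classical model. The exponentially decaying correlation function below, and the decay rate $\gamma$, are assumptions made purely for the example; the classical relation (8-164b) is used to obtain $\chi''_{BA}(\omega)$.

```latex
% Assumed model (not from the text): S_{BA}(t) = e^{-\gamma |t|}, with \gamma > 0.
\begin{align*}
S_{BA}(\omega) &= \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, e^{-\gamma|t|}
                = \frac{2\gamma}{\omega^2 + \gamma^2}, \\
\chi''_{BA}(\omega) &= \frac{\beta\omega}{2}\, S_{BA}(\omega)
                = \frac{\beta\gamma\,\omega}{\omega^2 + \gamma^2}, \\
\overline{\Delta B}(t) &= \frac{h}{\pi}\int_{-\infty}^{\infty} d\omega\,
      \frac{\chi''_{BA}(\omega)}{\omega - i\eta}\, e^{-i\omega t}
   \;\xrightarrow{\;\eta \to 0^+\;}\;
      \frac{h\beta}{\pi}\int_{-\infty}^{\infty} d\omega\,
      \frac{\gamma\, e^{-i\omega t}}{\omega^2 + \gamma^2}
   = h\beta\, e^{-\gamma t} \qquad (t > 0).
\end{align*}
% The last expression is h \beta S_{BA}(t), i.e. h \beta <B(t) A(0)>, as in (8-166b).
```

The limit $\eta \to 0^+$ is harmless here because $\chi''_{BA}(\omega)$ vanishes linearly at $\omega = 0$, so that the integrand has no singularity on the real axis.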


The above derivation is valid in both the classical and the quantum
mechanical descriptions, though the notation appears here to refer to the
former (in the quantum context one will just have to interpret the
observables as operators), since we have not made use of any specific formula
applicable to either of the two, to the exclusion of the other.


The static susceptibility.


In this context, it is worthwhile to recall the definition of the static
susceptibility $\chi_{BA}$ and to express it in terms of $\chi''_{BA}(\omega)$, thereby placing it within
the general context of non-equilibrium evolution. For this we consider a
time dependent driving force $h(t)$ of the following form (see fig. 8-6)

$$h(t) = h\,[e^{\eta t}\,\Theta(-t) + \Theta(t)]. \tag{8-170}$$


Figure 8-6: Depicting the variation of the driving force $h(t)$ appropriate for
defining the static susceptibility; the driving force rises infinitely slowly from zero
at $t \to -\infty$ to some specified value $h$ at $t = 0$, whereafter it remains constant,
i.e., $h(t) = h$ for $t > 0$; the state of the system remains fixed at a constrained
equilibrium (one pertaining to the Hamiltonian $H - Ah$) for $t > 0$ while the initial
state at $t \to -\infty$ is one of equilibrium pertaining to the intrinsic Hamiltonian $H$;
considering such a time variation, the static susceptibility $\chi_{BA}$ can be related
to the dynamic susceptibility $\chi''_{BA}(\omega)$; for notation, see text; in the quantum
description, all observables are represented by operators denoted symbolically
by hats overhead.


In this case the perturbation is switched on adiabatically so that the Hamiltonian gets changed infinitely slowly from $H$ at $t \to -\infty$ to $H - Ah$ at $t = 0$, whereafter it remains constant at $H - Ah$, and the system attains a state of constrained equilibrium at $t > 0$. We now have, again by (8-130a), (8-155),

$$\Delta\bar{B}(t) = 2ih\int_{-\infty}^{t} d\tau\,\chi''_{BA}(t-\tau)\,[\Theta(-\tau)\,e^{\eta\tau} + \Theta(\tau)]. \tag{8-171}$$

Once again, the formula remains formally the same in the quantum context where, however, $\chi''_{BA}(t)$ possesses a different interpretation. The integral can be evaluated in the limit $\eta \to 0^{+}$ in a manner analogous to the derivation of (8-169), and one obtains, for $t > 0$ (refer to [98], chapter 2),

$$\Delta\bar{B}(t) = \frac{h}{\pi}\int_{-\infty}^{\infty} d\omega\,\frac{\chi''_{BA}(\omega)}{\omega}, \tag{8-172}$$

where the right hand side is seen to be independent of $t$, as it should be in the case of a static problem.


In the classical description one can obtain an explicit expression for the time-independent response $\Delta\bar{B}$ by considering a static perturbation $-Ah$ and evaluating the average $\bar{B}$ in the perturbed equilibrium distribution in the linear approximation, when one obtains

$$\Delta\bar{B} = \beta h\,\langle AB\rangle \tag{8-173}$$

(check this out; we have abbreviated $A - \langle A\rangle$, $B - \langle B\rangle$ by $A$, $B$ respectively). In other words, the static response is proportional to the equilibrium correlation of $A$ and $B$ at equal time.
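
The linearization behind (8-173) can be sketched as follows. The symbols $\rho_h$ and $Z_h$ (the perturbed canonical distribution and its partition function) are introduced here only for illustration; $\rho_{\rm eq}$ and $Z$ refer to the intrinsic Hamiltonian $H$, and $A$, $B$ stand for the unshifted observables:

```latex
\rho_h = \frac{e^{-\beta (H - Ah)}}{Z_h}
       \approx \frac{e^{-\beta H}\,(1 + \beta h A)}{Z\,(1 + \beta h \langle A\rangle)}
       \approx \rho_{\rm eq}\,\bigl[\,1 + \beta h\,(A - \langle A\rangle)\,\bigr],
\qquad
\bar{B} - \langle B\rangle
       \approx \beta h\,\bigl\langle (A - \langle A\rangle)\,B \bigr\rangle
       = \beta h\,\bigl\langle (A - \langle A\rangle)(B - \langle B\rangle) \bigr\rangle .
```

The last equality uses $\langle A - \langle A\rangle\rangle = 0$; terms of order $h^2$ are dropped throughout.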


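The static susceptibility $\chi_{BA}$ is defined by means of the formula

$$\Delta\bar{B}(t) = h\,\chi_{BA}, \tag{8-174a}$$

which gives, in terms of $\chi''_{BA}(\omega)$ in the frequency domain representation,

$$\chi_{BA} = \frac{1}{\pi}\int_{-\infty}^{\infty} d\omega\,\frac{\chi''_{BA}(\omega)}{\omega}. \tag{8-174b}$$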



The relaxation function.

Going back to the relaxation problem and assuming that the initial process of slowly switching on the perturbation from $t \to -\infty$ to $t = 0$ is identical to the process considered above in the static problem, so that the strength ($h$) of the driving force at $t = 0$ is the same in (8-165) as in (8-170), we can substitute in (8-169) the value of $h$ obtained from (8-174a) as

$$h = \chi_{BA}^{-1}\,\Delta\bar{B}(0), \tag{8-175}$$

where $\Delta\bar{B}(0)$ signifies the steady value of the mean deviation attained in the static problem and also the value attained at $t = 0$ at the end of the process of turning on the perturbation in the relaxation problem, i.e., the initial value of the mean deviation (in the state of constrained equilibrium pertaining to the Hamiltonian $H - Ah$) from which the process of relaxation starts. In other words, the relaxation process is described by the formula

$$\Delta\bar{B}(t) = \chi_{BA}^{-1}\,\frac{1}{\pi}\int_{-\infty}^{\infty} d\omega\,\frac{\chi''_{BA}(\omega)}{\omega}\,e^{-i\omega t}\,\Delta\bar{B}(0), \tag{8-176}$$

where the static susceptibility $\chi_{BA}$ is given by (8-174b).


1. Once again, the formula (8-174b) holds in both the classical and the 
quantum descriptions (with the usual qualifications regarding the dif- 


ference in interpretation and notation between the two). 


2. Given the initial value ($\Delta\bar{B}(0)$) of the mean deviation, formula (8-176) describes the course of the relaxation process entirely in terms of the dynamic susceptibility function $\chi''_{BA}(\omega)$, where the term 'dynamic susceptibility' is used in a generic sense (it applies specifically to $\chi_{BA}(\omega)$, the Fourier transform of the causal response function $R_{BA}(t)$; see sec. 8.4.8 for the relation between $\chi_{BA}(\omega)$ and $\chi''_{BA}(\omega)$).

Analogous to the response functions $R_{BA}(t)$ and $R'_{BA}(t)$ (or, equivalently, $\chi''_{BA}(t)$), one can define a relaxation function $\mathcal{R}_{BA}(t)$ as

$$\Delta\bar{B}(t) = \mathcal{R}_{BA}(t)\,\chi_{BA}\,h. \tag{8-177}$$

On making use of (8-169) and (8-174b), one obtains the following expression for $\mathcal{R}_{BA}(t)$:

$$\mathcal{R}_{BA}(t) = \frac{\displaystyle\int_{-\infty}^{\infty} d\omega\,\frac{\chi''_{BA}(\omega)}{\omega}\,e^{-i\omega t}}{\displaystyle\int_{-\infty}^{\infty} d\omega\,\frac{\chi''_{BA}(\omega)}{\omega}}, \tag{8-178}$$

where this expression remains formally valid for both the classical and the quantum descriptions.


8.4.7.4 The Kubo correlation function 


The formulae (8-161b) and (8-163) express the dynamic susceptibility $\chi''_{BA}(\omega)$ in terms of the correlation functions $S_{BA}(\omega)$ and $\bar{S}_{BA}(\omega)$ in the quantum mechanical description, which implies that the relaxation function, which describes the course of the relaxation process as expressed in (8-176), can be represented entirely in terms of these correlation functions. In the classical description, both $S_{BA}(\omega)$ and $\bar{S}_{BA}(\omega)$ reduce to the classical correlation function given in (8-164a), (8-164b), and one obtains

$$[\text{classical:}]\quad \mathcal{R}_{BA}(t) = \frac{\displaystyle\int d\omega\,S_{\rm classical}(\omega)\,e^{-i\omega t}}{\displaystyle\int d\omega\,S_{\rm classical}(\omega)} \tag{8-179}$$

(check this out) where $S_{\rm classical}(\omega)$ stands for the classical correlation function mentioned above. A corresponding formula does not, however, hold for the quantum mechanical relaxation function with $S_{\rm classical}(\omega)$ replaced with either the unsymmetrized or the symmetrized quantum mechanical correlation function defined in 8.4.7.2. On the other hand, one can, as mentioned above, define the Kubo correlation function

$$K_{BA}(t) = \beta^{-1}\int_{0}^{\beta} d\lambda\,\langle e^{\lambda H} A\,e^{-\lambda H} B(t)\rangle = \beta^{-1}\int_{0}^{\beta} d\lambda\,\langle A(-i\lambda\hbar)\,B(t)\rangle, \tag{8-180}$$

which satisfies relations formally identical with (8-164b) and (8-179):


(8-181) 


This shows that the quantum mechanical generalization of the classical correlation function $S_{\rm classical}(t) = \langle B(t)A\rangle$ is not the function $S_{BA}(t)$ of (8-156a) or $\bar{S}_{BA}(t)$ of (8-162a) but the Kubo correlation function $K_{BA}(t)$, which can be used conveniently to describe the course of relaxation of the system under consideration from a state of constrained equilibrium, or of more general time evolution under an external driving in the linear approximation. In the classical limit, the Kubo correlation function $K_{BA}(\omega)$ reduces to the classical correlation function, which is the Fourier transform of $\langle B(t)A\rangle$.

Given the observable $A$, the expression

$$\frac{1}{\beta}\int_{0}^{\beta} d\lambda\,A(-i\lambda\hbar) = \frac{1}{\beta}\int_{0}^{\beta} d\lambda\,e^{\lambda H} A\,e^{-\lambda H} \tag{8-182}$$

occurring in (8-180) is referred to as its Kubo transform. The Kubo correlation function, defined in terms of the Kubo transform, is a very useful construct in obtaining quantum mechanical generalizations of classical formulae in linear response theory.


8.4.8 Response and correlation functions: symmetry and 


analyticity 


Symmetry properties. 


We state without proof a number of important symmetry and analyticity 


properties of the response and correlation functions (see [98] for details). 


Of considerable interest is the function $\chi''_{BA}(t) = \frac{1}{2i}R'_{BA}(t)$, where both $\chi''_{BA}$ and $R'_{BA}$ are loosely referred to as response functions. More precisely speaking, $\chi''_{BA}(t)$ is commonly referred to as the dissipation function, since it determines the energy dissipation in the system under consideration in a driven non-equilibrium process (see sec. 8.4.9).


The terminology regarding the various functions in the time and fre- 
quency domains introduced in the context of linear response theory 
does not follow a commonly accepted and precise usage, and is, at 
times, confusing; however, the mathematical notation and the pre- 
cise definition of symbols allow for no ambiguity; the terms used are 
just tags attached to the various symbols and can always be checked 


against the latter. 


From the definition of $\chi''_{BA}(t)$ and the expressions (8-130b), (8-149a), one obtains the symmetry property

$$\chi''_{BA}(t) = -\chi''_{AB}(-t), \tag{8-183}$$

which tells us that $\chi''_{BA}(t)$ is not invariant under time reversal. Indeed, it is $\chi''_{BA}(t)$ that defines the arrow of time, in being responsible for energy dissipation in a non-equilibrium process.

Another property of $\chi''_{BA}(t)$ that follows from the definition and the classical and quantum mechanical expressions for $R'_{BA}(t)$ alluded to above is that $\chi''_{BA}(t)$ is purely imaginary:

$$\chi''_{BA}(t)^{*} = -\chi''_{BA}(t). \tag{8-184}$$

This, however, does not determine the reality property of $\chi''_{BA}(\omega)$ (at times referred to as the spectral function), obtained from $\chi''_{BA}(t)$ by Fourier transformation. However, the following relations hold as consequences of (8-183), (8-184):

$$\chi''_{BA}(\omega) = -\chi''_{AB}(-\omega), \qquad \chi''_{BA}(\omega)^{*} = -\chi''_{BA}(-\omega). \tag{8-185}$$

The reality property of $\chi''_{BA}(\omega)$ is actually determined by the signatures of $A$, $B$ under the operation of time reversal.


The time reversal operator $T$ plays an important role in determining the analyticity and symmetry properties of the various response functions in the time domain and the frequency domain. An observable $A$ transforms under time reversal to $TAT^{-1}$, and we confine our considerations to observables that may be characterized by a definite signature $\epsilon_A$:

$$TAT^{-1} = \epsilon_A A, \tag{8-186}$$

where $\epsilon_A$ can be either $+1$ (positive signature, such as the position operator or the electric field) or $-1$ (negative signature, such as the momentum or the magnetic field).

If ($\epsilon_A$, $\epsilon_B$) denote the signatures of the observables $A$, $B$, then the following properties hold:

$$\begin{aligned}
\chi''_{BA}(\omega) &= \epsilon_A\epsilon_B\,\chi''_{AB}(\omega) && \text{(interchange of } A, B\text{)},\\
\chi''_{BA}(\omega)^{*} &= \epsilon_A\epsilon_B\,\chi''_{BA}(\omega) && \text{(complex conjugation)},\\
\chi''_{BA}(\omega) &= -\epsilon_A\epsilon_B\,\chi''_{BA}(-\omega) && (\omega \to -\omega).
\end{aligned} \tag{8-187}$$

Knowing these symmetry properties of the dissipation function $\chi''_{BA}(\omega)$, one can easily deduce the corresponding symmetry properties of the correlation functions $S_{BA}(\omega)$ and $\bar{S}_{BA}(\omega)$ from the relations (8-161b), (8-163). These symmetry properties validate the Onsager reciprocity relations introduced in sec. 8.2.2.


Analyticity properties.

Starting from the relation between $\chi_{BA}$ and $\chi''_{BA}$ in the time domain,

$$\chi_{BA}(t) = 2i\,\Theta(t)\,\chi''_{BA}(t), \tag{8-188}$$

and making use of the Fourier transform of $\Theta(t)$ (in an appropriate limiting sense), one can invoke standard results in complex analysis to arrive at the following important relation in the frequency domain

$$\chi_{BA}(\omega) = \lim_{\eta\to 0^{+}}\int_{-\infty}^{\infty}\frac{d\omega'}{\pi}\,\frac{\chi''_{BA}(\omega')}{\omega' - \omega - i\eta} = \mathcal{P}\int_{-\infty}^{\infty}\frac{d\omega'}{\pi}\,\frac{\chi''_{BA}(\omega')}{\omega' - \omega} + i\chi''_{BA}(\omega), \tag{8-189}$$

where the symbol $\mathcal{P}$ denotes the principal value of an integral.

If the signatures of $A$, $B$ are the same, as is often the case, then the second equality in (8-187) tells us that $\chi''_{BA}(\omega)$ is real. One then concludes from (8-189) that it constitutes the imaginary part of $\chi_{BA}(\omega)$,

$$\chi''_{BA}(\omega) = {\rm Im}\,\chi_{BA}(\omega). \tag{8-190}$$


At the same time, denoting the real part of $\chi_{BA}(\omega)$ as $\chi'_{BA}(\omega)$, one obtains

$$\chi'_{BA}(\omega) = {\rm Re}\,\chi_{BA}(\omega) = \mathcal{P}\int_{-\infty}^{\infty}\frac{d\omega'}{\pi}\,\frac{\chi''_{BA}(\omega')}{\omega' - \omega}. \tag{8-191a}$$

This integral relation possesses an inverse wherein $\chi''_{BA}(\omega)$, the imaginary part of $\chi_{BA}(\omega)$, appears as an integral transform of the real part $\chi'_{BA}(\omega)$,

$$\chi''_{BA}(\omega) = {\rm Im}\,\chi_{BA}(\omega) = \mathcal{P}\int_{-\infty}^{\infty}\frac{d\omega'}{\pi}\,\frac{\chi'_{BA}(\omega')}{\omega - \omega'}. \tag{8-191b}$$

These two important formulae are referred to as the Kramers-Kronig relations. One infers, for instance, that the dynamic susceptibility $\chi_{BA}(\omega)$ is determined completely if one knows the dissipation function $\chi''_{BA}(\omega)$.
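
The relation (8-191a) can be checked numerically on a simple model. The following is a minimal sketch, assuming a damped-oscillator susceptibility $\chi(\omega) = 1/[m(\omega_0^2 - \omega^2 - i\gamma\omega)]$ with illustrative parameter values; the function names, the parameters, and the quadrature scheme are choices made here for illustration and are not taken from the text.

```python
import numpy as np

def chi(omega, omega0=1.0, gamma=0.5, m=1.0):
    """Assumed model: susceptibility of a damped oscillator, analytic in the upper half plane."""
    return 1.0 / (m * (omega0**2 - omega**2 - 1j * gamma * omega))

def kk_real_part(omega, half_width=200.0, step=0.01):
    """Estimate Re chi(omega) from Im chi via the principal-value integral of (8-191a)."""
    n = int(round(half_width / step))
    # Grid points sit symmetrically about omega, offset by half a step, so the
    # 1/(omega' - omega) singularity cancels pairwise (a discrete principal value).
    k = np.arange(-n, n) + 0.5
    omega_p = omega + k * step
    integrand = chi(omega_p).imag / (omega_p - omega)
    return np.sum(integrand) * step / np.pi

if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0, 2.0):
        print(w, kk_real_part(w), chi(w).real)
```

At $\omega = 0$ the same integral reduces to the static susceptibility formula (8-174b), which for this model gives $1/(m\omega_0^2)$.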


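Considering for the sake of completeness the case when the observables $A$, $B$ are of opposite signatures under the time reversal operation, $\chi''_{BA}(\omega)$ is purely imaginary by the second equality in (8-187), and one obtains from (8-189) the complementary results

$${\rm Re}\,\chi_{BA}(\omega) = i\chi''_{BA}(\omega), \qquad {\rm Im}\,\chi_{BA}(\omega) = -i\,\mathcal{P}\int_{-\infty}^{\infty}\frac{d\omega'}{\pi}\,\frac{\chi''_{BA}(\omega')}{\omega' - \omega}. \tag{8-192}$$

Additionally, making use of formula (8-189), one can see that the symmetry properties of $\chi''_{BA}(\omega)$ expressed in (8-187) imply corresponding symmetry properties of $\chi_{BA}(\omega)$. In particular, the first equality implies that

$$\chi_{BA}(\omega) = \epsilon_A\epsilon_B\,\chi_{AB}(\omega) \quad \text{(interchange of } A, B\text{)}, \tag{8-193a}$$

which tells us that, if the observables $A$, $B$ are of the same signature under time reversal, then

$$\chi_{BA}(\omega) = \chi_{AB}(\omega). \tag{8-193b}$$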


This expresses an important reciprocity relation: the response of the observable $B$ to a driving force coupled to the observable $A$ is equal to the response of $A$ to a driving force coupled to $B$. For instance, in the case of magnetic resonance, the magnetization induced in the y-direction due to a field imposed in the x-direction will be equivalent to the magnetization in the x-direction due to a field in the y-direction.

Incidentally, the fact that $R_{BA}(t) = 0$ for $t < 0$ implies that the Fourier transform $\chi_{BA}(\omega)$, and hence its imaginary part $\chi''_{BA}(\omega)$, is an analytic function of $\omega$ for $\omega$ lying in the upper half of the complex $\omega$-plane. In other words, the first equality in (8-189) holds for complex values of $\omega$, with $\chi_{BA}(\omega)$, $\chi''_{BA}(\omega')$ analytic in the upper half plane, and the function $\chi_{BA}(\omega)$ for real $\omega$ is obtained as the limit as the complex argument approaches the real axis from above. On evaluating $\chi_{BA}(\omega)$ for any specified system, one can check that its singularities are indeed confined to the lower half $\omega$-plane (see sec. 8.4.11.1 for the concrete example of a damped harmonic oscillator).


One obtains an explicit expression for the dynamic susceptibility $\chi_{BA}(\omega)$ in terms of matrix elements of the operators $A$, $B$ (in the energy basis of the unperturbed system) by taking the Fourier transform of (8-149b) as

$$\chi_{BA}(\omega) = \frac{i}{\hbar}\int_{0}^{\infty} dt\,e^{i\omega t}\,\langle (B(t)A - AB(t))\rangle, \tag{8-194a}$$

or, on making use of the complete set of intrinsic energy eigenstates, along with the Heisenberg representation of $B(t)$,

$$\chi_{BA}(\omega) = \frac{i}{\hbar Z}\sum_{m,n}\int_{0}^{\infty} dt\,e^{i\omega t}\,e^{-\beta\hbar\omega_m}\,[e^{i\omega_{mn}t}\,B_{mn}A_{nm} - e^{-i\omega_{mn}t}\,A_{mn}B_{nm}]. \tag{8-194b}$$

In this expression the summation indices $m$, $n$ label the complete set of intrinsic energy eigenstates ($|m\rangle$, $|n\rangle$), $A_{mn}$, $B_{mn}$ stand for the matrix elements $\langle m|A|n\rangle$, $\langle m|B|n\rangle$, and $\omega_m = \frac{E_m}{\hbar}$, $\omega_{mn} = \frac{E_m - E_n}{\hbar}$ for various relevant values of $m$, $n$, with $E_m$, $E_n$ being the corresponding energy eigenvalues. Finally, $Z$ stands for the initial equilibrium partition function corresponding to the unperturbed Hamiltonian. On regularizing the integrals with $e^{-\eta t}$ ($\eta \to 0^{+}$), one arrives at the expression

$$\chi_{BA}(\omega) = \frac{1}{\hbar Z}\sum_{m,n} e^{-\beta\hbar\omega_m}\left[\frac{A_{mn}B_{nm}}{\omega - \omega_{mn} + i\eta} - \frac{B_{mn}A_{nm}}{\omega + \omega_{mn} + i\eta}\right]. \tag{8-194c}$$

The presence of the term $i\eta$ in the denominator of each term in the above summation ensures the analyticity property of $\chi_{BA}(\omega)$ mentioned above. From the physical point of view, the inclusion of $i\eta$ corresponds to an adiabatic switching on of the perturbation in (8-142), involving a time-dependent coupling $h(t) = [e^{\eta t}\,\Theta(-t) + \Theta(t)]\,h$, with $\eta \to 0^{+}$.


8.4.9 Energy absorption and dissipation 


We continue to consider a system in the linear regime driven by a set of forces $h_i(t)$ coupled to observables $A_i$ ($i = 1, 2, \cdots, n$, say), as a result of which the intrinsic Hamiltonian $H$ is perturbed to the time-dependent Hamiltonian

$$H' = H - \sum_{i=1}^{n} A_i\,h_i(t), \tag{8-195}$$

where the driving functions $h_i(t)$ ($i = 1, 2, \cdots, n$) are all real. The operator representing the energy absorption by the system from the external agency responsible for the perturbation can be expressed in the form

$$W = -\int_{-\infty}^{\infty} dt\sum_{i=1}^{n} A_i\,\dot{h}_i(t). \tag{8-196}$$

Working in the Schrödinger picture, the expectation value of the energy absorbed (we denote it by $\bar{W}$) can then be written in the form

$$\bar{W} = -\int_{-\infty}^{\infty} dt\sum_{i=1}^{n}\dot{h}_i(t)\,\bar{A}_i(t). \tag{8-197a}$$

Here $\bar{A}_i(t)$ stands for the non-equilibrium average of $A_i$ and is given by

$$\bar{A}_i(t) = {\rm Tr}\big(\hat{\rho}(t)\,A_i\big), \tag{8-197b}$$

where $\hat{\rho}(t)$ is the distribution function at time $t$ that arises from the initial (at $t \to -\infty$) distribution $\hat{\rho}_{\rm eq}$ (the equilibrium distribution pertaining to the intrinsic Hamiltonian $H$) in virtue of the non-equilibrium evolution of the system caused by the perturbation. In the linear regime, we have, by (8-150),

$$\bar{A}_i(t) = \langle A_i\rangle + \sum_{j}\int_{-\infty}^{\infty} R_{ij}(t - t')\,h_j(t')\,dt', \tag{8-197c}$$

where $\langle A_i\rangle$ is the equilibrium expectation value of $A_i$ (relative to the intrinsic Hamiltonian $H$), and $R_{ij}$ is the causal response function for the observable $A_i$ when the observable $A_j$ of the system is coupled to the driving function $h_j(t)$ (refer to sec. 8.4.4.2; in the present context we consider, for generality, $n$ driving functions ($n = 1, 2, \cdots$), but do not take into account the space dependence of observables and driving functions for the sake of simplicity). On substituting into (8-197a), one finds that the contribution coming from the first term in (8-197c) vanishes.


In order to see this one has to perform an integration by parts in $\int_{-\infty}^{\infty} dt\,\dot{h}_i(t)\,\langle A_i\rangle$, when it reduces to $-\int_{-\infty}^{\infty} dt\,h_i(t)\,\frac{d}{dt}\langle A_i\rangle$ (where we make the assumption that the driving function $h_i(t)$ goes to zero for $t \to \pm\infty$). This vanishes in virtue of the fact that the equilibrium expectation value under the intrinsic Hamiltonian ($\langle A_i\rangle$) is time-independent.

One thereby obtains, from (8-197a), (8-197c),

$$\bar{W} = -\int_{-\infty}^{\infty} dt\int_{-\infty}^{\infty} dt'\sum_{ij}\dot{h}_i(t)\,h_j(t')\,R_{ij}(t - t'). \tag{8-197d}$$

We next introduce the Fourier transforms of the response and driving functions and then perform the integrations over $t$, $t'$, so as to obtain

$$\bar{W} = -\sum_{ij}\int\frac{d\omega}{2\pi}\int d\omega_1\int d\omega_2\,\delta(\omega + \omega_1)\,\delta(\omega + \omega_2)\,\chi_{ij}(\omega)\,(-i\omega_1)\,h_i(\omega_1)\,h_j^{*}(\omega_2) \tag{8-198}$$

(check this out). Making use of the reality of the driving functions and carrying out the integrations over $\omega_1$, $\omega_2$, we end up with

$$\bar{W} = -\sum_{ij}\int\frac{d\omega}{2\pi}\,h_i^{*}(\omega)\,h_j(\omega)\,(i\omega)\,\chi_{ij}(\omega) \tag{8-199}$$

(check this out). We now substitute $\omega \to -\omega$ within the integral, interchange the dummy suffixes $i$, $j$, add the resulting expression to (8-199), divide by two, and once again make use of the reality of the driving functions to obtain

$$\bar{W} = -\frac{1}{2}\sum_{ij}\int\frac{d\omega}{2\pi}\,h_i^{*}(\omega)\,h_j(\omega)\,(i\omega)\,[\chi_{ij}(\omega) - \chi_{ji}(-\omega)] \tag{8-200}$$

(check this out). Finally, making use of (8-193a), and the reality of $R_{ij}(t)$ (which implies $\chi_{ij}^{*}(\omega) = \chi_{ij}(-\omega)$), one arrives at

$$\bar{W} = -\frac{1}{2}\sum_{ij}\int\frac{d\omega}{2\pi}\,h_i^{*}(\omega)\,h_j(\omega)\,(i\omega)\,[\chi_{ij}(\omega) - \epsilon_i\epsilon_j\,\chi_{ij}^{*}(\omega)] \tag{8-201}$$

(check this out) where $\epsilon_i$ ($i = 1, 2, \cdots, n$) stands for the signature under time reversal for the observable $A_i$.


Having set up the general formula for the energy absorption, we consider for the sake of concreteness the special case of a single driving function $h(t)$ coupled to a single observable $A$. Noting that the square of the signature of an observable under time reversal is $+1$, one obtains

$$\bar{W} = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,\omega\,|h(\omega)|^{2}\,\chi''(\omega). \tag{8-202}$$

In this expression, $\chi(\omega)$ stands for the Fourier transform of the response function $R_{AA}(t)$, which determines the mean deviation of $A$ at time $t$ when $A$ itself is driven by $h(t)$, and $\chi''(\omega)$ denotes its imaginary part.

The above formula actually represents the energy dissipated in the system since it is an integral over the ensemble average of the energy absorbed in a time-dependent non-equilibrium process. The ensemble average gives us the energy lost within the system to all its various degrees of freedom. Since the energy dissipation in a non-equilibrium process has to be a positive quantity, we arrive at an important requirement to be satisfied by the dissipation function $\chi''(\omega)$:

$$\omega\,\chi''(\omega) > 0, \tag{8-203}$$

for all $\omega$ (reason this out; consider a driving function $h(t)$ such that $h(\omega)$ is a delta-function). More generally, the matrix with elements $\omega\chi''_{ij}(\omega)$ is to be positive definite.
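
Formula (8-202) can be illustrated numerically for a single driven degree of freedom. The sketch below assumes a classical damped oscillator, $m\ddot{x} + m\gamma\dot{x} + m\omega_0^2 x = h(t)$, with susceptibility $\chi(\omega) = 1/[m(\omega_0^2 - \omega^2 - i\gamma\omega)]$ and a Gaussian-modulated driving pulse; the model, the parameter values, and all function names are assumptions made for illustration. It compares the work $\int h\,\dot{x}\,dt$ (equal to $-\int \dot{h}\,x\,dt$ for a pulse vanishing at $t \to \pm\infty$) with the frequency-domain expression.

```python
import math
import numpy as np

OMEGA0, GAMMA, MASS = 1.0, 0.2, 1.0   # assumed oscillator parameters
SIGMA, DRIVE = 3.0, 1.0               # assumed pulse width and carrier frequency

def h(t):
    """Driving pulse: Gaussian envelope times a cosine carrier."""
    return math.exp(-t**2 / (2 * SIGMA**2)) * math.cos(DRIVE * t)

def h_hat(omega):
    """Analytic Fourier transform of h, with h(omega) = int dt exp(i omega t) h(t)."""
    pref = SIGMA * np.sqrt(2 * np.pi) / 2
    return pref * (np.exp(-SIGMA**2 * (omega - DRIVE)**2 / 2)
                   + np.exp(-SIGMA**2 * (omega + DRIVE)**2 / 2))

def chi_imag(omega):
    """Imaginary part of the assumed damped-oscillator susceptibility."""
    return (1.0 / (MASS * (OMEGA0**2 - omega**2 - 1j * GAMMA * omega))).imag

def work_time_domain(t_max=40.0, dt=0.002):
    """RK4 integration of the driven oscillator, accumulating int h(t) v(t) dt."""
    def deriv(t, x, v):
        return v, h(t) / MASS - GAMMA * v - OMEGA0**2 * x
    x, v, t, work = 0.0, 0.0, -t_max, 0.0
    for _ in range(int(round(2 * t_max / dt))):
        power_old = h(t) * v
        k1x, k1v = deriv(t, x, v)
        k2x, k2v = deriv(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
        k3x, k3v = deriv(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
        k4x, k4v = deriv(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        work += 0.5 * dt * (power_old + h(t) * v)   # trapezoidal rule for the power
    return work

def work_frequency_domain(omega_max=10.0, n=200001):
    """Right hand side of (8-202) by direct summation on a uniform frequency grid."""
    omega = np.linspace(-omega_max, omega_max, n)
    d_omega = omega[1] - omega[0]
    integrand = omega * np.abs(h_hat(omega))**2 * chi_imag(omega)
    return np.sum(integrand) * d_omega / (2 * np.pi)

if __name__ == "__main__":
    print(work_time_domain(), work_frequency_domain())
```

For this model $\omega\chi''(\omega) = \gamma\omega^2/\{m[(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2]\}$ is non-negative, in line with (8-203), so the absorbed energy comes out positive for any pulse shape.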


8.4.10 Dynamic structure factor: density correlations 


by neutron scattering 


An important special case of correlations in a system is that of density cor- 


relations which finds diverse applications in condensed matter physics. 


The great merit of the linear response theory is that it allows one to 
break loose from the confines of dilute gases and to obtain meaningful 


results for condensed systems. 


The density operator for a fluid made up of $N$ particles is defined by the expression

$$n(\mathbf{r}, t) = \sum_{j=1}^{N}\delta(\mathbf{r} - \mathbf{r}_j(t)), \tag{8-204}$$

where $\mathbf{r}_j(t)$ stands for the position operator of the $j$th particle ($j = 1, 2, \cdots, N$) at time $t$ (determined by Heisenberg evolution based on the intrinsic Hamiltonian of the fluid; the latter will be assumed to be in equilibrium at the beginning and end of the scattering process to be considered below). We omit here, for the sake of simplicity, the hat symbol over the quantum mechanical operators appearing in the theory.


Since the density operator depends continuously on the position vector 
r (which, however, is a parameter rather than an operator; more pre- 
cisely, we consider the density operator in the position representation, in 
which the former is diagonal), one has to keep track of this parameter- 
dependence in looking at density-density correlations, as mentioned in 
sec. 8.4.4.2 in the context of fields of operators. The perturbation involv- 
ing the density field results from a coupling to an external c-number field 
determined by the nature of the probe interacting with the fluid (we con- 
sider here a neutron beam as the probe). It will turn out that the spatial 
Fourier transform of the density field gives a convenient description of 
the scattering process (in addition, the Fourier transform from the time 


domain to the frequency domain will be made use of). 


We now confine ourselves to the specific context of an inelastic neutron scattering experiment where a parallel beam of thermal neutrons of momentum $\hbar\mathbf{k}_i$ is made to scatter from a fluid (the target) and the scattered neutrons of various different momenta $\hbar\mathbf{k}_f$ are detected. The scattering data are collected to work out the differential scattering cross-section $\frac{d^2\sigma}{d\Omega\,d\epsilon_f}$, where $d\Omega$ stands for an infinitesimal solid angle centered at the target and $d\epsilon_f$ for a small energy interval relating to the scattered neutron. Following [98], one obtains the result

$$\frac{d^2\sigma}{d\Omega\,d\epsilon_f} = \frac{1}{V}\,\frac{k_f}{k_i}\left(\frac{m}{2\pi\hbar^2}\right)^{2}|U(\mathbf{k})|^{2}\sum_{i,f} p_i\,\Big|\langle f|\sum_{j=1}^{N} e^{i\mathbf{k}\cdot\mathbf{r}_j}|i\rangle\Big|^{2}\,\delta(\hbar\omega - E_f + E_i). \tag{8-205}$$

In this expression, $V$ stands for the volume of the scattering fluid, $m$ for the mass of the neutron, and $U(\mathbf{k})$ for the spatial Fourier transform of the potential characterizing the interaction between a neutron and the fluid. In arriving at this expression one assumes a weak interaction between a neutron located at $\mathbf{r}$ and the scatterer atoms, of the form

$$H'(\mathbf{r}) = \int d^3r'\,U(\mathbf{r} - \mathbf{r}')\,n(\mathbf{r}'), \tag{8-206}$$

as a result of which the scattering process can be considered in the Born approximation, and the Fermi golden rule can be invoked to work out the cross-section. Further, the initial and final states of the neutron ($|\mathbf{k}_i\rangle$, $|\mathbf{k}_f\rangle$) are assumed to be decoupled from the corresponding states ($|i\rangle$, $|f\rangle$) of the fluid, where one averages over initial states of the fluid $|i\rangle$ (probability $p_i = \frac{e^{-\beta E_i}}{Z}$) and sums over final states $|f\rangle$. Finally, the expression includes the factor $|\langle f|\sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j}|i\rangle|^{2}$, which relates to density correlations in the fluid. The delta function accounts for energy conservation, wherein $\hbar\omega = \epsilon_i - \epsilon_f$ stands for the energy loss of the neutron and equals the energy transferred to the fluid. It may be noted that the interaction (8-206) is of the linear response form where the weak field $U(\mathbf{r} - \mathbf{r}')$ couples with the density field of the scatterer, as a result of which the scattering cross


section can be related to the density correlations in the fluid.

Making use of the integral representation of the delta function ($\delta(\hbar\omega - E_f + E_i) = \frac{1}{2\pi\hbar}\int dt\,e^{i(\hbar\omega - E_f + E_i)t/\hbar}$), one obtains, on invoking the formula for the Heisenberg evolution of the operators $\mathbf{r}_j(t)$ ($j = 1, 2, \cdots, N$) and performing the summation over the complete set of final states $|f\rangle$,

$$\frac{d^2\sigma}{d\Omega\,d\epsilon_f} = \frac{1}{2\pi\hbar}\,\frac{k_f}{k_i}\left(\frac{m}{2\pi\hbar^2}\right)^{2}|U(\mathbf{k})|^{2}\,S_T(\mathbf{k}, \omega), \tag{8-207a}$$

where $S_T(\mathbf{k}, \omega)$, given by

$$S_T(\mathbf{k}, \omega) = \frac{1}{V}\int dt\int d^3r\int d^3r'\,e^{i\omega t - i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}')}\,S_T(\mathbf{r}, \mathbf{r}', t), \tag{8-207b}$$

stands for the spatial and temporal Fourier transform of the density correlation function

$$S_T(\mathbf{r}, \mathbf{r}', t) = \langle n(\mathbf{r}, t)\,n(\mathbf{r}', 0)\rangle. \tag{8-207c}$$

Here the suffix 'T' is used to indicate that it is the total density $n(\mathbf{r}, t)$ that is involved in (8-207a)-(8-207c). One can refer instead to the density deviation from the equilibrium value

$$\delta n(\mathbf{r}, t) = n(\mathbf{r}, t) - \langle n(\mathbf{r}, t)\rangle, \tag{8-208}$$

in which case one obtains

$$S_T(\mathbf{r}, \mathbf{r}', t) = S(\mathbf{r}, \mathbf{r}', t) + \langle n(\mathbf{r}, t)\rangle\,\langle n(\mathbf{r}', 0)\rangle, \tag{8-209a}$$

where

$$S(\mathbf{r}, \mathbf{r}', t) = \langle\delta n(\mathbf{r}, t)\,\delta n(\mathbf{r}', 0)\rangle \tag{8-209b}$$

is the correlation of density deviations, commonly referred to as the 'dynamic structure factor', this being a particular (and special) instance of a term that is also used in the more general sense of time correlation functions of various descriptions.
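
The decomposition in (8-209a) follows in one line from the definition (8-208). As a brief sketch, writing $n = \langle n\rangle + \delta n$ for both factors and using $\langle\delta n\rangle = 0$:

```latex
\langle n(\mathbf{r},t)\, n(\mathbf{r}',0) \rangle
  = \bigl\langle \bigl(\langle n(\mathbf{r},t)\rangle + \delta n(\mathbf{r},t)\bigr)
                 \bigl(\langle n(\mathbf{r}',0)\rangle + \delta n(\mathbf{r}',0)\bigr) \bigr\rangle
  = \langle n(\mathbf{r},t)\rangle \langle n(\mathbf{r}',0)\rangle
    + \langle \delta n(\mathbf{r},t)\, \delta n(\mathbf{r}',0) \rangle .
```

The two cross terms drop out because each contains a single factor $\langle\delta n\rangle$, which vanishes in equilibrium.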


The formula (8-207a) achieves a clear separation between the interaction term ($|U(\mathbf{k})|^{2}$) and the intrinsic fluctuation term ($S_T(\mathbf{k}, \omega)$, and hence $S(\mathbf{k}, \omega)$). Neutron scattering experiments provide us with information about the latter and hence, through the fluctuation-dissipation theorem, about the density response function (or equivalently, about the dynamic susceptibility) with reference to the external perturbation. The dynamic susceptibility, in turn, is obtained independently in spectroscopic experiments that lend validity to the fluctuation-dissipation theorem itself.


While neutron scattering experiments are of great value in the study of transport processes in solids by relating these to dynamic structure factors and dynamic susceptibilities, light scattering is also of similar value, especially in studying the corresponding features of transport in liquids (recall the study of static properties of liquids by light and X-ray scattering in sec. 4.5.4). Light scattering is made use of in determining the transport coefficients of systems of diverse descriptions (for an outline, see [117], chapter 10).

8.4.11 Linear response theory: applications


Applications of linear response theory cover a vast range, since this the- 
ory is of universal validity for systems close to equilibrium as much as the 
canonical distribution (or any other distribution equivalent to it) provides 
us with a universal description of properties of systems in equilibrium. 
Indeed, the former rests on the latter in an essential way. The time corre- 
lation functions describing linear response are all equilibrium expectation 
values relative to the intrinsic Hamiltonian of the system under consider- 
ation, and are thus, in principle, amenable to calculation with arbitrary 


precision. 


Computational methods based on the molecular dynamics approach ef- 
ficiently calculate these equilibrium expectation values pertaining to a 
wide range of non-equilibrium processes for a variety of systems, includ- 
ing condensed matter systems of great diversity. Many of these compu- 
tational approaches are based on effective theoretical schemes providing 
us with fruitful approximations to the desired time correlation functions, 
based on which the molecular dynamics computations are invoked, yielding satisfactory results. Originating in the nineteen fifties (refer to the seminal paper [82] by Kubo; see also [83]), linear response theory is now a well-established and active area of theoretical and experimental physics.


Experimentally, time correlation functions are inferred from a variety of scattering data that lead to dynamic structure factors, the latter being the Fourier transforms of the correlation functions. In addition, the absorption spectra of weakly perturbed systems provide rich information about the response functions by locating the poles of the dynamic susceptibilities, the latter being the Fourier transforms of the former. The response functions and the time correlation functions, in turn, are related by the fluctuation-dissipation theorem.


In this book, however, I have set myself the task of presenting basic prin- 
ciples of statistical mechanics in a coherent manner, without getting into 
the complex maze of applications (the ‘slide-down’ referred to by Feyn- 
man, see sec. 1.1.2) of any particular set of principles such as those 
involved in the linear response theory. In the following, I include brief 
accounts of only a few chosen applications of the linear response the- 
ory, just to indicate how the basic ideas of the theory can be related to 
characteristic features of specific systems. As I have already mentioned, 
applications of the linear response theory make up a vast territory, en- 


tirely out of bounds of this introductory book. 


8.4.11.1 The driven harmonic oscillator 


The linear response of a harmonic oscillator close to a state of thermal equilibrium is of great relevance in non-equilibrium statistical mechanics. Such an equilibrium state may be realized by keeping the oscillator in thermal contact with a reservoir at temperature $T$. For instance, the oscillator in question may belong to a large assembly of oscillators whose state is described by a canonical ensemble at temperature $T$, in which case the reservoir is made up of all the oscillators other than the one under consideration. The state of the latter, regardless of the state of the reservoir, is then described by a canonical ensemble at temperature $T$. We will be interested in the response of the oscillator to a weak driving force acting on it. The response will be found to be analogous to that of a damped oscillator acted upon by a fluctuating force in addition to the external driving.


When we consider a subsystem (S’) belonging to a larger system (S), 
the state of the former, regardless of the state of the latter (under 
specified conditions or constraints), is referred to as its reduced state. 
Knowing the constraints imposed on S, the equation governing the time-evolution of the reduced state of S’ can be worked out and is referred to as its master
equation. The long time behavior of the solution to the master equa-
tion can be seen on the average to tend to a state of S’ described by 
a canonical ensemble, much as a damped harmonic oscillator tends 
to the equilibrium state of its own. While it is difficult to arrive at the 
master equation when S’ and S are specified arbitrarily, an explicit 
equation can be arrived at when S’ is a harmonic oscillator and S is 
a reservoir made up of a large number of harmonic oscillators (the 
‘oscillator bath’). The problem of reduction from an oscillator bath 


will be briefly addressed in sec. 8.7.2. 


The damped harmonic oscillator under the influence of a fluctuating force constitutes a metaphor of great relevance for the behavior of macroscopic systems, but before outlining and explaining this metaphor we will address in concrete terms the linear response of the oscillator, close to an equilibrium state described by a canonical ensemble, to a weak external driving force, which is our principal concern in the present section.


The intrinsic Hamiltonian of the oscillator (mass $m$, frequency $\omega_0$) in the absence of driving is given by

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2}m\omega_0^2\hat{x}^2, \tag{8-210a}$$

in familiar notation. We assume that the oscillator is perturbed by a weak driving force, due to which the Hamiltonian gets modified to

$$\hat{H}' = \hat{H} - \hat{x}\,h(t). \tag{8-210b}$$


Assuming further that the state of the oscillator is described by a canon- 
ical ensemble at temperature JT at an initial time in the distant past 
(t — —oo) we look at the mean deviation in z at time t. According to 


the linear response theory, this is given by the expression 


Agi) = [ Ryo(t — T)h(r)dr, (8-21 1a) 


(oe) 


where $R_{xx}(t)$ is the relevant causal response function in the time
domain. Taking the Fourier transform of both sides, the above formula
assumes the form (refer to (8-153))

$$\Delta\bar{x}(\omega) = \chi_{xx}(\omega)\,h(\omega), \tag{8-211b}$$


where we follow the convention of denoting the Fourier transform by the
same symbol as the function itself, transforming only the argument
($t \to \omega$). In the above expression, $\chi_{xx}(\omega)$ is related to
the dissipation function ($\chi''_{xx}(\omega)$) relevant to the problem by a
relation of the form (8-189), while $\chi''_{xx}(\omega)$ itself is the
Fourier transform of $\chi''_{xx}(t)$ (which equals $R''_{xx}(t)$ up to a
constant factor), given by (refer to formula (8-149a))

$$\chi''_{xx}(t) = \frac{1}{2\hbar}\langle[\hat{x}(t), \hat{x}]\rangle, \tag{8-211c}$$


where $\hat{x}(t)$ stands for the operator resulting from
$\hat{x} = \hat{x}(0)$ due to time evolution, in the Heisenberg picture,
under the intrinsic Hamiltonian $H$ through time $t$. Invoking the
Heisenberg equations of motion for $\hat{x}(t)$, $\hat{p}(t)$ one obtains,
in analogy with the classical solution,

$$\hat{x}(t) = \hat{x}(0)\cos\omega_0t + \frac{\hat{p}(0)}{m\omega_0}\sin\omega_0t, \tag{8-212a}$$


which can now be used to evaluate the right hand side of (8-211c), with the
help of the commutator $[\hat{p}, \hat{x}] = -i\hbar$. One thereby gets

$$\chi''_{xx}(t) = \frac{1}{2im\omega_0}\sin\omega_0t. \tag{8-212b}$$


The above expression for $\chi''_{xx}(t)$ is consistent with a number of
reality and symmetry properties of $\chi''_{BA}(t)$. For instance,
$\chi''_{BA}(t)$ is, by definition, the trace of the product of a Hermitian
and an anti-Hermitian operator and hence is imaginary. Further, it satisfies
the relation (see sec. 8.4.8; see also [98], chapter 3)

$$\chi''_{BA}(t) = -\epsilon_A\epsilon_B\,\chi''_{BA}(-t), \tag{8-213}$$

where $\epsilon_A, \epsilon_B$ are the signatures of $A$, $B$ under time
reversal. This implies that, in the special case $B = A$, $\chi''_{AA}(t)$
has to be odd under time reversal, which is actually seen to be the case
here.


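It is now straightforward to take the Fourier transform and obtain

$$\chi''_{xx}(\omega) = \frac{\pi}{2m\omega_0}\left[\delta(\omega-\omega_0) - \delta(\omega+\omega_0)\right], \tag{8-214}$$

(check this out). The symmetrized correlation function in the frequency
domain can then be obtained from (8-163) as

$$S_{xx}(\omega) = \frac{\pi\hbar}{2m\omega_0}\coth\left(\frac{\beta\hbar\omega_0}{2}\right)\left[\delta(\omega-\omega_0) + \delta(\omega+\omega_0)\right]. \tag{8-215}$$

This could be obtained directly from the definition of $S_{xx}(\omega)$
($= \int dt\,e^{i\omega t}\,\frac{1}{2}\langle\hat{x}(t)\hat{x} + \hat{x}\hat{x}(t)\rangle$)
(check this out), thereby verifying the fluctuation-dissipation theorem for
the harmonic oscillator.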
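The direct check suggested above can be sketched numerically. Integrating (8-215) over $\omega/2\pi$ gives the equal-time value $\langle\hat{x}^2\rangle = \frac{\hbar}{2m\omega_0}\coth(\beta\hbar\omega_0/2)$, and the same quantity can be computed as an explicit canonical average over the oscillator levels, using $\langle n|\hat{x}^2|n\rangle = \frac{\hbar}{m\omega_0}(n+\frac{1}{2})$. The function names, the default units ($\hbar = m = \omega_0 = 1$) and the level cutoff `n_max` below are illustrative assumptions, not part of the text.

```python
import math

def thermal_mean_square_x(beta, hbar=1.0, m=1.0, omega0=1.0, n_max=2000):
    """Canonical average of x^2 from an explicit (truncated) sum over levels."""
    x = beta * hbar * omega0
    # Boltzmann weights relative to the ground state; n_max is an assumed cutoff.
    weights = [math.exp(-x * n) for n in range(n_max)]
    partition = sum(weights)
    mean_n_plus_half = sum((n + 0.5) * w for n, w in enumerate(weights)) / partition
    return hbar / (m * omega0) * mean_n_plus_half

def fdt_mean_square_x(beta, hbar=1.0, m=1.0, omega0=1.0):
    """Integral of (8-215) over omega / (2 pi): (hbar / 2 m w0) coth(beta hbar w0 / 2)."""
    return hbar / (2.0 * m * omega0) / math.tanh(0.5 * beta * hbar * omega0)

if __name__ == "__main__":
    for beta in (0.1, 1.0, 5.0):
        print(beta, thermal_mean_square_x(beta), fdt_mean_square_x(beta))
```

The two columns printed for each $\beta$ should agree to within rounding, which is the content of the fluctuation-dissipation theorem for this system at equal times.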


Finally, the energy absorbed by the oscillator from the external driving
agency (and dissipated into the reservoir with which the oscillator is in
thermal contact) is obtained from (8-202) as

$$W = \frac{|h(\omega_0)|^2}{2m} \tag{8-216}$$

(check this out). In other words, energy is absorbed only from that Fourier
component of the driving force which coincides with the natural frequency of
the oscillator. This is a special instance of the more general statement
that the absorption spectrum of a system under a perturbation
$-\hat{A}h(t)$ coincides with the singularities of the dissipation function
$\chi''_{AA}(\omega)$.

Since the equations of motion of the harmonic oscillator are linear in the
position and momentum variables, one can construct explicit solutions for
various characteristic properties of the oscillator as we find in the
present section, these solutions being, moreover, formally analogous to the
corresponding classical expressions.


The formal similarity between the classical and the quantum mechanical 
harmonic oscillator carries over to the case of the damped oscillator as 


well when one obtains 


(8-217) 


where $\gamma > 0$ stands for the damping constant. The singularities of
$\chi_{xx}(\omega)$ and $\chi''_{xx}(\omega)$ are seen to lie in the lower
half $\omega$-plane, as required by causality, and one obtains the result
(8-214) in the limit $\gamma \to 0^+$. The energy absorbed from the external
driving force can be seen to peak at the resonant value of the driving
frequency $\omega = \omega_0$, the natural frequency of the oscillator
(check this out).
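The two checks mentioned above can be sketched numerically under a stated assumption. The susceptibility used below, $\chi_{xx}(\omega) = 1/[m(\omega_0^2 - \omega^2 - i\gamma\omega)]$, is the standard classical form for a damping force $-m\gamma\dot{x}$; it is assumed here for illustration and may differ from (8-217) in how $\gamma$ is normalized. The function names are hypothetical.

```python
import cmath

def chi(omega, m=1.0, omega0=1.0, gamma=0.1):
    """ASSUMED damped-oscillator susceptibility 1 / (m (w0^2 - w^2 - i gamma w))."""
    return 1.0 / (m * (omega0**2 - omega**2 - 1j * gamma * omega))

def poles(omega0=1.0, gamma=0.1):
    """Roots of w^2 + i gamma w - w0^2 = 0, i.e. the singularities of chi."""
    root = cmath.sqrt(4.0 * omega0**2 - gamma**2)
    return ((-1j * gamma + root) / 2.0, (-1j * gamma - root) / 2.0)

def absorption(omega, m=1.0, omega0=1.0, gamma=0.1):
    """Absorbed power per unit |h(w)|^2 is proportional to w * Im chi(w)."""
    return omega * chi(omega, m, omega0, gamma).imag

if __name__ == "__main__":
    print("poles:", poles())
    grid = [0.5 + 0.001 * k for k in range(1001)]
    print("absorption peaks near w =", max(grid, key=absorption))
```

Both poles have imaginary part $-\gamma/2$ in the underdamped case, and the scan locates the absorption maximum at $\omega_0$, in line with the statements above.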


8.4.11.2 Charge transport: the Kubo formula 


Linear response theory is of great importance in that it makes possible the
theoretical and numerical computation of transport coefficients, which are
characteristic features of materials of various types whose knowledge leads
to diverse technological applications.

Notable among such transport coefficients is the electrical conductivity of
materials, describing their response to externally imposed weak electric
fields. Linear response theory leads to a general formula for the
conductivity that can be adapted to a variety of material types so as to
lead to the value of the conductivity of a material of any given
specification, this being an instance of what are generally referred to as
Kubo formulas.


Consider an assembly of $N$ charged particles having charges $q_i$
($i = 1, 2, \cdots, N$), with an imposed electric field $\mathbf{E}$,
uniform in space. The intrinsic Hamiltonian $H$ of the system is perturbed
due to the field to

$$H' = H - \sum_{i=1}^{N} q_i\,\hat{\mathbf{r}}_i\cdot\mathbf{E}(t), \tag{8-218}$$

where $\hat{\mathbf{r}}_i$ stands for the operator representing the position
vector of the $i$th particle and where the electric field is treated
classically.


The interaction Hamiltonian with the external field is written in the
dipole approximation, where $\mathbf{P} = \sum_i q_i\hat{\mathbf{r}}_i$
stands for the dipole operator of the system of charged particles. The
intrinsic Hamiltonian $H$ includes the contribution from long range forces
between the charges.


Writing the interaction Hamiltonian as $-\mathbf{P}\cdot\mathbf{E}(t)$, one
can express the response of the system in terms of the current density
$\mathbf{j}$ resulting from the perturbation. However, it is more convenient
to look at the mean deviation of the operator
$\mathbf{J} = \frac{d\mathbf{P}}{dt}$, to be referred to as the ‘integrated
current’:

$$\Delta\bar{J}_\alpha(t) = \sum_\gamma\int_{-\infty}^{\infty} d\tau\,\chi_{J_\alpha P_\gamma}(t-\tau)\,E_\gamma(\tau), \tag{8-219}$$

where $\alpha, \gamma\,(= 1, 2, 3)$ denote spatial indices and
$\chi = \chi_{JP}$, with Cartesian components $\chi_{J_\alpha P_\gamma}$,
stands for the causal response function tensor. The Fourier transform of the
above formula is to be compared with the definition of the
frequency-dependent conductivity tensor $\sigma(\omega)$:

$$\Delta\bar{j}_\alpha(\omega) = \sum_\gamma\sigma_{\alpha\gamma}(\omega)\,E_\gamma(\omega), \tag{8-220}$$

where $\sigma_{\alpha\gamma}(\omega)$ ($\alpha, \gamma = 1, 2, 3$) stand for
the Cartesian components of the conductivity tensor.


For a homogeneous system acted upon by a weak uniform electric field, the
(non-equilibrium) expectation value of $\mathbf{j}(\omega)$ is independent
of the position vector $\mathbf{r}$, in which case one has the relation

$$\Delta\bar{\mathbf{J}}(\omega) = V\,\Delta\bar{\mathbf{j}}(\omega), \tag{8-221}$$

where $V$ stands for the volume of the system.

Classically, the current density for a continuously distributed system is
defined as $\mathbf{j}(\mathbf{r}) = \rho(\mathbf{r})\mathbf{v}(\mathbf{r})$,
where $\rho(\mathbf{r})$ stands for the charge density and
$\mathbf{v}(\mathbf{r})$ for the velocity of a charge element at
$\mathbf{r}$. On the other hand, $\mathbf{J} = \frac{d\mathbf{P}}{dt}$ is
given by $\int d^3r\,\rho(\mathbf{r})\mathbf{v}(\mathbf{r})$. Thus, if the
current density is independent of $\mathbf{r}$, the integrated current
$\mathbf{J}$ equals $V\mathbf{j}$. A corresponding relation holds for the
quantum mechanical expectation values of the operators as well.


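Thus, more explicitly,
$\sigma_{\alpha\gamma}(\omega) = \frac{1}{V}\chi_{J_\alpha P_\gamma}(\omega)$.
Linear response theory then tells us that $\sigma_{\alpha\gamma}(\omega)$ is
related to $\chi''_{J_\alpha P_\gamma}(\omega)$, the imaginary part of
$\chi_{J_\alpha P_\gamma}(\omega)$, as

$$\sigma_{\alpha\gamma}(\omega) = \frac{1}{V}\lim_{\eta\to0^+}\int\frac{d\omega'}{\pi}\,\frac{\chi''_{J_\alpha P_\gamma}(\omega')}{\omega'-\omega-i\eta}, \tag{8-222a}$$

(check this out; refer to formula (8-189), the right hand side of which is
equivalent to
$\lim_{\eta\to0^+}\int\frac{d\omega'}{\pi}\frac{\chi''_{BA}(\omega')}{\omega'-\omega-i\eta}$).
We rewrite the above relation as

$$\sigma_{\alpha\gamma}(\omega) = \frac{1}{V}\lim_{\eta\to0^+}\int\frac{d\omega'}{i\pi}\,\frac{i\omega'\chi''_{J_\alpha P_\gamma}(\omega')}{\omega'(\omega'-\omega-i\eta)}. \tag{8-222b}$$

In this formula, $i\omega'\chi''_{J_\alpha P_\gamma}(\omega')$ can be seen to
be the Fourier transform of $\chi''_{J_\alpha J_\gamma}(t)$ (check this out;
recall the relation $\mathbf{J} = \frac{d\mathbf{P}}{dt}$), which leads us
to the relation

$$\sigma_{\alpha\gamma}(\omega) = \frac{1}{V}\int\frac{d\omega'}{i\pi}\,\frac{\chi''_{J_\alpha J_\gamma}(\omega')}{\omega'(\omega'-\omega-i\eta)}, \tag{8-223}$$

where the limit $\eta \to 0^+$ is left understood. Finally, we express
$\chi''_{J_\alpha J_\gamma}(\omega')$ in terms of the Kubo correlation
function of the integrated current components $J_\alpha, J_\gamma$ by making
use of the first equation in (8-181), thereby arriving at

$$\sigma_{\alpha\gamma}(\omega) = \frac{\beta}{V}\int\frac{d\omega'}{2\pi i}\,\frac{K_{J_\alpha J_\gamma}(\omega')}{\omega'-\omega-i\eta}. \tag{8-224}$$

As a corollary, one obtains the DC conductivity tensor as

$$\sigma_{\alpha\gamma}(0) = \frac{\beta}{V}\int\frac{d\omega'}{2\pi i}\,K_{J_\alpha J_\gamma}(\omega')\left[P\frac{1}{\omega'} + i\pi\delta(\omega')\right]. \tag{8-225}$$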


Taking the trace of the matrix $\sigma_{\alpha\gamma}(0)$ one obtains the
longitudinal conductivity [98] (the DC conductivity in the case of an
isotropic material) as

$$\sigma = \frac{1}{3}\sum_\alpha\sigma_{\alpha\alpha}(0) = \frac{\beta}{6V}\sum_\alpha K_{J_\alpha J_\alpha}(\omega = 0), \tag{8-226}$$

(check this out) where we have made use of the fact that a Kubo
auto-correlation function ($K_{J_\alpha J_\alpha}(\omega)$ in the present
context) is even with respect to $\omega$ (refer to the first equality in
(8-185)), so that the principal value term in (8-225) drops out. This can be
expressed in the form of an integral in the time domain

$$\sigma = \frac{\beta}{3V}\sum_\alpha\int_0^\infty dt\,K_{J_\alpha J_\alpha}(t). \tag{8-227}$$


Formula (8-227), which is a special case of (8-224), constitutes an instance
of what are generally referred to as Kubo formulae (or the Green-Kubo
formulae), where transport coefficients are related to time integrals of
correlation functions between relevant currents. Classical limits are
obtained by making use of the fact that, in this limit, the Kubo functions
reduce to the respective classical correlation functions, i.e., in the time
domain,

$$\text{[classical limit:]}\quad K_{BA}(t) \to \langle B(t)A\rangle. \tag{8-228}$$


Note that the time integral in (8-227) arises because of the zero-frequency
limit having been taken in the frequency-dependent conductivity tensor
(8-224). This is what is involved in the definition of transport
coefficients since the latter characterize the time-evolution of
long-wavelength and low-frequency perturbations over the stationary and
spatially uniform equilibrium states of systems, such space- and
time-dependent perturbations being commonly referred to as hydrodynamic
modes (refer to sec. 8.4.11.3 below). In the present case of a charged
system being driven by a uniform electric field, the long-wavelength limit
is built into our model (8-218) from the very beginning, and only the
low-frequency limit needs to be taken in arriving at the relevant transport
coefficient. Typically, a transport coefficient can be expressed in the form
of an integral of a time auto-correlation function by means of a Kubo
formula as in (8-227) above.
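As a rough illustration of how (8-227) is used in its classical limit (8-228), the sketch below evaluates the time integral numerically for an assumed model: $N$ independent carriers of charge $q$ and mass $m$ whose velocity correlations decay as $e^{-t/\tau}$, so that $\sum_\alpha\langle J_\alpha(0)J_\alpha(t)\rangle = 3Nq^2(k_BT/m)\,e^{-t/\tau}$. This exponential (Drude-like) model, the function names and the parameter values are assumptions made for the example only; with them the Kubo integral should reproduce the familiar value $(N/V)q^2\tau/m$.

```python
import math

def green_kubo_sigma(current_acf, beta, volume, dt, n_steps):
    """Trapezoidal estimate of (beta / 3V) * int_0^T sum_alpha <J_a(0) J_a(t)> dt.

    current_acf(t) must return the correlation already summed over alpha.
    """
    total = 0.5 * (current_acf(0.0) + current_acf(n_steps * dt))
    total += sum(current_acf(k * dt) for k in range(1, n_steps))
    return beta * total * dt / (3.0 * volume)

def drude_acf(n_particles, charge, mass, beta, tau):
    """ASSUMED model: independent carriers, exponentially decaying correlations."""
    # Three Cartesian components, each contributing N q^2 (k_B T / m) at t = 0.
    amplitude = 3.0 * n_particles * charge**2 / (beta * mass)
    return lambda t: amplitude * math.exp(-t / tau)

if __name__ == "__main__":
    N, q, m, beta, tau, V = 1000, 1.5, 2.0, 0.7, 0.3, 50.0
    sigma = green_kubo_sigma(drude_acf(N, q, m, beta, tau), beta, V,
                             dt=tau / 1000.0, n_steps=40000)
    print(sigma, (N / V) * q**2 * tau / m)
```

For a real material the correlation function would come from a simulation or a microscopic calculation rather than from a closed-form model; only the final integration step is shown here.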


8.4.11.3 Linear response theory: transport coefficients 


As mentioned above, the derivation of the Kubo formula for the electrical
conductivity of a system is an instance of the application of the linear
response theory to the calculation of transport coefficients that
characterize non-equilibrium time-evolution of systems under a range of
experimental conditions that is broad, though still limited in scope. In the
first place, transport coefficients are defined within the scope of the
linear response theory where thermodynamic forces and fluxes bear a linear
relationship with one another. In addition, these relate to conserved
quantities pertaining to the intrinsic Hamiltonian that vary slowly under
weak perturbations, the variations being essentially independent of small
scale spatial correlations. In other words, the transport coefficients
assume relevance in describing variations of conserved quantities under weak
perturbations, where the variations typically occur over large spatial
distances and time intervals. Such variations are said to define the
hydrodynamic description of non-equilibrium time-evolution, and the relevant
observables define the hydrodynamic modes of the system under consideration.
Indeed, the most general description of non-equilibrium processes would
involve myriads of ‘modes’, each with its own spatial and temporal scale,
among which only a very special set of modes are the hydrodynamic ones,
being characterized by transport coefficients relating the thermodynamic
forces to the corresponding fluxes, all these forces and fluxes being
defined in macroscopic terms (i.e., in terms of phase space averages).


The linear response theory provides us with formulae for the transport 
coefficients in terms of phase space averages of microscopic variables 
where, moreover, all these variables are of a standard form that can be 
expressed in terms of correlations between appropriate currents. In the 
quantum theoretic formulation, these are precisely the Kubo correlations 
introduced in sec. 8.4.7.4, where the formula (8-224) constitutes an ex- 


ample of a transport coefficient expressed in terms of a Kubo correlation. 


Of course, such formulae for the transport coefficients constitute only the
first step in an actual computation of a transport coefficient for any
specified system of interest, since one has then to obtain expressions for
the relevant currents in terms of the microscopic variables of the system
under consideration and to evaluate the relevant equilibrium averages with
these expressions in place. For instance, starting from (8-224), one can set
up the currents either for a dilute gaseous system, or for a metallic
conductor of a specified structure and then work out the equilibrium
averages, arriving at entirely different results in the two cases. The final
stages of such calculations are routinely performed nowadays by invoking the
approach of molecular simulations [140]. However, these details of
calculations relating to the transport coefficients for specific systems
will not be taken up in this book.


Electrical conduction or magnetic response are commonly referred to as 
being caused by mechanical perturbations, and are distinguished from 
diffusion or thermal conduction, the latter being referred to as processes 
caused by thermal perturbations. However, linear response theory allows 
one to work out the transport coefficients relating to all these processes 
(whether caused by mechanical or thermal perturbations) within a com- 
mon framework. While a mechanical perturbation modifies the micro- 
scopic equations of motion characterizing a system, usually by modifying 
its Hamiltonian, a thermal perturbation, generally speaking, modifies the 
boundary conditions that result in the development of gradients in relevant
macroscopic variables (see [33], chapters 1, 4, 5). However, not much is to
be read into the distinction between the two types of perturbations and the
associated processes since the theories leading to the expressions for the
relevant transport coefficients have a considerable degree of overlap.


Generally speaking, transport coefficients appear as constants in a linear
relationship between sets of macroscopically defined currents and gradients,
the latter being specified as phase space averages. The relevant currents
and gradients, in addition to being time dependent, possess a space
dependence as well (the case of a uniform electric field considered in
sec. 8.4.11.2 corresponds to a uniform potential gradient), in which case
the non-equilibrium phase space averages of observables appearing in the
theory are, in general, space- and time-dependent as in sec. 8.4.4.2. Within
the framework of the linear response theory, it is then found convenient to
work in terms of spatial and temporal Fourier transforms, the latter
characterized by respective wavelengths and frequencies. The hydrodynamic
modes correspond to variations of conserved quantities due to weak
perturbations, the variations being, precisely, ones with long wavelengths
and low frequencies.


As a simple example of how a transport coefficient can be expressed in 
terms of the time correlation function of the microscopic current asso- 
ciated with a conserved quantity, we consider below the case of self- 
diffusion in the classical theory. The case of the shear viscosity will be 
taken up next, again in the classical theory, within the framework of a 


simplified approach. 


A. Diffusion coefficient. 


We work out the Kubo formula for the self-diffusion coefficient of a fluid
by following two distinct, though related, approaches. The first of these
starts from the phenomenological Fick’s law where a linear relationship is
assumed between the concentration gradient and the diffusion current, both
defined in macroscopic terms, and then relates it to a microscopically
defined probability density. Starting from an initially non-uniform
distribution of concentration, the diffusion coefficient is obtained in the
long time limit in the absence of external driving when the uniform
equilibrium distribution is approached. In the second approach, which makes
direct use of the linear response formalism, one introduces an external
driving so as to balance the diffusive process of removal of the
concentration gradient. The condition for a steady state to result from this
balance is then combined with the linear response formula to obtain the
diffusion coefficient.


Adopting the first of the two approaches referred to above, we consider the
diffusive motion of an assembly of tagged molecules in a fluid close to an
equilibrium configuration where there is a small concentration gradient of
the molecules. The tagged molecules are assumed to move independently of one
another in the background of all the remaining molecules of the fluid (thus,
collisions among the tagged molecules are assumed to be rare). Since the
number of the tagged molecules is conserved, their concentration
$n(\mathbf{r}, t)$ is related to the associated current density
$\mathbf{j}(\mathbf{r}, t)$ as

$$\frac{\partial n}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{8-229}$$


The macroscopically defined quantities $n$ and $\mathbf{j}$ can be
interpreted as the non-equilibrium averages of the microscopic phase space
functions

$$\rho(\mathbf{r}, t) = \sum_i\delta(\mathbf{r} - \mathbf{r}_i(t)), \qquad \mathbf{j}(\mathbf{r}, t) = \sum_i\mathbf{v}_i\,\delta(\mathbf{r} - \mathbf{r}_i(t)), \tag{8-230}$$

where the summation is over all the tagged particles in the assembly. These
formulae are consistent with (8-229).


The current $\mathbf{j}$ can be related back to the concentration $n$ by
means of the phenomenological equation

$$\text{[Fick’s law:]}\quad \mathbf{j} = -D\nabla n, \tag{8-231}$$

where $D$ stands for the diffusion constant. This equation, defining the
diffusion constant $D$, holds for low concentrations of the tagged
particles, and can be combined with (8-229) to give a closed equation (the
‘diffusion equation’) for $n$:

$$\frac{\partial n(\mathbf{r}, t)}{\partial t} = D\nabla^2n(\mathbf{r}, t). \tag{8-232a}$$


Since the tagged particles move independently of one another, we can
concentrate on one single particle and choose the normalization

$$\int n(\mathbf{r}, t)\,d^3r = 1. \tag{8-232b}$$

Additionally, we choose the initial condition

$$n(\mathbf{r}, 0) = \delta^{(3)}(\mathbf{r}), \tag{8-232c}$$

where $\delta^{(3)}(\mathbf{r})$ stands for the three dimensional
$\delta$-function. These equations are consistent with the following
probabilistic interpretation of $n(\mathbf{r}, t)$:
$n(\mathbf{r}, t)\,d^3r$ gives the probability of the particle being located
in the volume element $d^3r$ around $\mathbf{r}$ at time $t$, given that it
is located at the origin at the initial time 0. Denoting the corresponding
probability density by the new name $P(\mathbf{r}, t)$ we have

$$\frac{\partial P(\mathbf{r}, t)}{\partial t} = D\nabla^2P(\mathbf{r}, t). \tag{8-233}$$

One can make use of this equation to work out the mean squared displacement
of the tagged particle at time $t$:

$$\langle r^2\rangle(t) = 6Dt. \tag{8-234}$$


In this expression (referred to as Einstein’s formula), the angular brackets
($\langle\cdots\rangle$) on $r^2$ signify the equilibrium average since the
diffusion equation (8-233) holds in the long run ($t \to \infty$) when
short-term correlations die out and the diffusive motion of the tagged
particles is well-approximated by the hydrodynamic description. In the
$t \to \infty$ limit, replacing the equilibrium average by the
non-equilibrium average results in only a small correction on the right hand
side of the above equation.


Check formula (8-234) out. The left hand side is given by
$\int r^2P(\mathbf{r}, t)\,d^3r$. Differentiate with respect to $t$, make
use of (8-233), and then integrate by parts twice in succession. Finally,
integrate back with respect to $t$.
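The same check can be sketched numerically. The code below assumes the standard Gaussian solution of (8-233) for a $\delta$-function start at the origin, $P(\mathbf{r}, t) = (4\pi Dt)^{-3/2}\exp(-r^2/4Dt)$ (this explicit solution is not quoted in the text), and integrates it radially to confirm the normalization (8-232b) and the mean squared displacement (8-234). The function names, the radial cutoff and the step count are illustrative choices.

```python
import math

def gaussian_solution(r, t, D):
    """Spherically symmetric solution of (8-233) for a delta-function start at the origin."""
    return (4.0 * math.pi * D * t) ** -1.5 * math.exp(-r * r / (4.0 * D * t))

def radial_moment(power, t, D, n_steps=20000):
    """Midpoint estimate of int r^power P d^3r = int_0^inf 4 pi r^(2 + power) P dr."""
    r_max = 12.0 * math.sqrt(2.0 * D * t)  # twelve standard deviations; tail neglected
    dr = r_max / n_steps
    total = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * dr
        total += 4.0 * math.pi * r ** (2 + power) * gaussian_solution(r, t, D)
    return total * dr

if __name__ == "__main__":
    D, t = 0.3, 2.0
    print("normalization:", radial_moment(0, t, D))
    print("<r^2>:", radial_moment(2, t, D), "vs 6 D t =", 6.0 * D * t)
```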


To establish the proportionality of $n(\mathbf{r}, t)$ and
$P(\mathbf{r}, t)$ on a more rigorous (and less intuitive) basis, see [29],
chapter 3, [19], chapter 8.

On the other hand, noting that the instantaneous displacement of the tagged
particle in the diffusive process is the time integral of its velocity (or,
more precisely, of
$\delta\mathbf{v}\,(= \mathbf{v}(t) - \langle\mathbf{v}(t)\rangle)$), one
obtains

$$\langle r^2\rangle(t) = \int_0^t dt'\int_0^t dt''\,\langle\mathbf{v}(t')\cdot\mathbf{v}(t'')\rangle, \tag{8-235}$$

where we have made use of the fact that $\langle\mathbf{v}(t)\rangle = 0$. A
few steps of algebra then show that, in the long time limit (see [101],
chapter 21), this gives

$$\langle r^2\rangle(t) \approx 2t\int_0^\infty\langle\mathbf{v}(0)\cdot\mathbf{v}(t')\rangle\,dt'. \tag{8-236}$$
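The step from (8-235) to (8-236) can be illustrated numerically. For a stationary correlation $C(s) = \langle\mathbf{v}(0)\cdot\mathbf{v}(s)\rangle$ depending only on the time difference, the double integral in (8-235) equals $2\int_0^t(t - s)\,C(s)\,ds$, which approaches $2t\int_0^\infty C(s)\,ds$ once $t$ is much larger than the correlation time. The sketch below assumes, purely for illustration, an exponentially decaying $C(s)$; the function name and parameter values are hypothetical.

```python
import math

def msd_from_vacf(vacf, t, n_steps=20000):
    """Midpoint estimate of 2 * int_0^t (t - s) C(s) ds.

    This equals the double integral in (8-235) when the velocity
    correlation depends only on the time difference s = |t' - t''|.
    """
    ds = t / n_steps
    return 2.0 * ds * sum((t - (k + 0.5) * ds) * vacf((k + 0.5) * ds)
                          for k in range(n_steps))

if __name__ == "__main__":
    c0, tau = 3.0, 1.0  # ASSUMED exponential model C(s) = c0 * exp(-s / tau)
    vacf = lambda s: c0 * math.exp(-s / tau)
    for t in (1.0, 10.0, 100.0, 1000.0):
        ratio = msd_from_vacf(vacf, t) / (2.0 * t * c0 * tau)
        print(t, ratio)  # ratio tends to 1 as t grows beyond tau
```

For this model the ratio equals $1 - (\tau/t)(1 - e^{-t/\tau})$, so the correction to (8-236) falls off as $\tau/t$.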


Comparing (8-234) and (8-236), one obtains the following expression for the
diffusion coefficient in terms of an equilibrium time correlation function:

$$D = \frac{1}{3}\int_0^\infty\langle\mathbf{v}(0)\cdot\mathbf{v}(t)\rangle\,dt. \tag{8-237}$$

Since we are considering a single tagged particle (an assembly of $N$
independent particles may also be considered, with appropriate scaling in
$N$) the velocity $\mathbf{v}$ can be identified with the integrated
microscopic current

$$\mathbf{v} = \mathbf{J} = \int d^3r\,\mathbf{j}(\mathbf{r}, t), \tag{8-238}$$

(refer to the second equality in (8-230)), when we obtain the diffusion
coefficient in the Green-Kubo form

$$D = \frac{1}{3}\int_0^\infty\langle\mathbf{J}(0)\cdot\mathbf{J}(t)\rangle\,dt. \tag{8-239}$$


Following the second of the two approaches mentioned earlier, we now proceed
to obtain the same result by making direct use of the linear response theory
outlined in sections 8.4.4 and 8.4.5. Here we first assume that the tagged
particle is subjected to a weak force causing a perturbation to its
intrinsic Hamiltonian $H$ to

$$H' = H - h\,\theta(t)\,x, \tag{8-240}$$

which corresponds to the action of a constant driving force in the
x-direction for $t > 0$. The mean deviation of the x-component of the
velocity ($v_x$), assuming that the system is in equilibrium (relative to
the intrinsic Hamiltonian $H$) at $t = 0$, is then (refer to (8-129b))

$$\Delta\bar{v}_x(t) = \bar{v}_x(t) - \langle v_x\rangle = \beta h\int_0^t d\tau\,\langle\dot{x}(0)\,v_x(t-\tau)\rangle \approx \beta h\int_0^\infty\langle v_x(0)v_x(t)\rangle\,dt, \tag{8-241}$$

where the last equality is obtained in the limit of large $t$, when
short-term correlations are eliminated. The mobility in the x-direction is
then

$$\mu_x = \frac{\Delta\bar{v}_x}{h} = \beta\int_0^\infty\langle v_x(0)v_x(t)\rangle\,dt. \tag{8-242}$$

We now relate this to the diffusion coefficient $D_x$ by imagining a
situation where the ‘drift current’ due to the constant driving force (for
$t > 0$) balances the diffusion current generated by a gradient in the
particle concentration. Making use of the fact that the net current in this
imagined situation is zero, one arrives at the Einstein relation (see, for
instance, [83], [18], chapter 7)

$$D_x = \frac{\mu_x}{\beta}. \tag{8-243}$$


On now taking into account the isotropy of the fluid in which the diffusion
occurs, we finally obtain

$$D\,(= D_x = D_y = D_z) = \frac{1}{3}\int_0^\infty\langle\mathbf{v}(0)\cdot\mathbf{v}(t)\rangle\,dt, \tag{8-244}$$

which is precisely the same formula as (8-237), leading to (8-239).

B. Shear viscosity


Following [140] (chapter 13) I outline below a simplified approach for 
the derivation of the Kubo formula for the shear viscosity of a fluid, 
where we make direct use of the linear response theory as in the case of 
the diffusion coefficient via (8-241) (for an alternative derivation starting 
from the linearized Navier-Stokes equation, analogous to the derivation of 
the diffusion coefficient by starting from the diffusion equation (8-232a), 
see [101], chapter 21). 


A fluid flows under an arbitrarily small shearing force, but is
characterized by an internal resistance in virtue of which the stress tensor
varies in proportion to the rate of change of shear. Consider, for instance,
a flow in which the x-component of the fluid velocity ($u_x$) varies with
the y-co-ordinate in the vicinity of a chosen point (relative to a Cartesian
system with the chosen point as origin, at which the velocity is zero) at a
rate $\frac{\partial u_x}{\partial y} = \gamma$ (say), the other components
of the velocity and its gradient being all zero. Thus the velocity field in
the vicinity of the origin is given by

$$\mathbf{u}(\mathbf{r}) = \gamma y\,\hat{e}_x, \tag{8-245}$$

where $\hat{e}_x$ stands for the unit vector along the x-direction
($\hat{e}_y$ is similarly defined). Newton’s formula for shear viscosity
then states that the x-y component of the pressure tensor at a point at
which the velocity gradient (of the type indicated above) is $\gamma$, is
given by

$$P_{xy} = -\eta\gamma, \tag{8-246}$$


where $\eta$ denotes the coefficient of shear viscosity (commonly referred
to, in brief, as the ‘viscosity’; a compressible fluid is characterized by a
second viscosity coefficient, namely, the bulk viscosity).

See [101], chapter 17, for an explanation of the term ‘pressure tensor’; the
pressure tensor was encountered earlier in this book in sec. 8.3.6; it is
the same as the stress tensor, taken with a negative sign.

In the formula (8-246), the pressure tensor and the velocity field (and
hence the gradient $\gamma$) are defined in a macroscopic sense, and one now
needs microscopic definitions to which the macroscopic ones are related in
the sense of an ensemble average.


For a fluid made up of $N$ particles (assumed to be identical for the sake
of convenience, each with a mass $m$) with position and momentum variables
$\{\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N\}$,
$\{\mathbf{p}_1, \mathbf{p}_2, \cdots, \mathbf{p}_N\}$, the intrinsic
Hamiltonian is of the form

$$H = \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m} + U(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N), \tag{8-247}$$

where $U(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N)$ stands for the
interaction energy of the particles.


The external force field setting up the flow modifies the equations of
motion of the particles, and a consistent choice for the microscopic
velocities, based on the macroscopic relation (8-245), would be

$$\dot{\mathbf{r}}_i = \frac{\mathbf{p}_i}{m} + \gamma(\mathbf{r}_i\cdot\hat{e}_y)\,\hat{e}_x\,h(t), \tag{8-248a}$$

since this gives the correct ensemble average, as
$\langle\mathbf{p}_i\rangle = 0$ in the equilibrium ensemble. In this
equation of motion $h(t)$ stands for a driving function which we assume to
be of the form $\theta(t)$, while the small parameter characterizing the
perturbation is $\gamma$.

The other half of the set of the microscopic equations of motion is to be
such as to conserve the phase space volume, and can be written as

$$\dot{\mathbf{p}}_i = \mathbf{F}_i - \gamma p_{iy}\,\hat{e}_x\,h(t) = \mathbf{F}_i - \gamma(\mathbf{p}_i\cdot\hat{e}_y)\,\hat{e}_x\,h(t), \tag{8-248b}$$

where

$$\mathbf{F}_i = -\frac{\partial U}{\partial\mathbf{r}_i} \tag{8-248c}$$

is the force on the $i$th particle ($i = 1, 2, \cdots, N$ in equations
(8-248a)-(8-248c)) due to all the other particles in the fluid.


1. Notice that the above equations of motion are not truly microscopic
ones since the constant $\gamma$ in these is chosen as the macroscopic
velocity gradient. In other words, these are only effectively microscopic
equations. Moreover, in our present simplified approach, these equations
are of the non-Hamiltonian type (see, however, [96], section 2.2;
[33], chapter 4, includes a more complete theory for shear viscosity).


2. The equations of motion (8-248a), (8-248b), though non-Hamiltonian 
in nature, nevertheless conserve the phase space volume and 
imply the existence of a conserved quantity analogous to energy 
([140], chapter 13). This is in contrast to the basic fact that vis- 
cosity is a dissipative process. The same problem is inherent in 
the derivation of the diffusion coefficient by the use of the Hamiltonian
perturbation described by (8-240). In reality, a molecular dynamics
simulation of viscous flow or diffusion based on these perturbations
quickly results in an increase of temperature of the system because of the
pumping of energy into it by the external perturbation, which does not get
dissipated into a reservoir. In
other words, the derivations in the present section are notional 
ones that do not include the feature of dissipation, but lead to 
correct expressions for the transport coefficients nevertheless 
since these make use of equations of motion that effectively sim- 


ulate the dynamical features of viscous flow and diffusion. 


The dissipative current $j(z) = j(\mathbf{r}^N, \mathbf{p}^N)$ defined in
(8-138) then works out to

$$j = \gamma\sum_{i=1}^{N}\left[\frac{1}{m}(\mathbf{p}_i\cdot\hat{e}_x)(\mathbf{p}_i\cdot\hat{e}_y) + (\mathbf{F}_i\cdot\hat{e}_x)(\mathbf{r}_i\cdot\hat{e}_y)\right], \tag{8-249}$$

where we have, for the sake of convenience, included the small perturbation
parameter $\gamma$ (the velocity gradient) in the definition of $j$. This is
a slight departure from our earlier notation where the perturbation
parameter was denoted by $h$ and was included in the driving function
$h(t) = h\theta(t)$ (refer to (8-240)). In the present context, we take
$h(t) = \theta(t)$, as mentioned above.


The dissipative current obtained above is closely related to the pressure
tensor $P_{xy}$, defined in microscopic terms. The latter has to satisfy the
consistency requirement that its ensemble average has to reduce to the
macroscopically defined pressure tensor $P_{xy}$. This leads to the
following expression for $P_{xy}$ (refer to [140], chapters 5, 13):

$$P_{xy} = \frac{1}{V}\sum_{i=1}^{N}\left[\frac{1}{m}(\mathbf{p}_i\cdot\hat{e}_x)(\mathbf{p}_i\cdot\hat{e}_y) + (\mathbf{F}_i\cdot\hat{e}_x)(\mathbf{r}_i\cdot\hat{e}_y)\right], \tag{8-250}$$

where $V$ stands for the volume of the fluid (reason this out; consider a
volume element in the fluid in the form of a small rectangular
parallelepiped and work out the net x-momentum transferred per unit time
through a face perpendicular to the y-axis). In this expression, the fluid
is assumed to be macroscopically uniform, though not isotropic (because of
the velocity gradient).


In other words, the dissipative current is given in terms of the pressure
tensor by the expression

$$j = \gamma VP_{xy}, \tag{8-251}$$

where the arguments made up of the phase space co-ordinates are left
implied. Since the non-equilibrium average of $P_{xy}$ at any specified time
$t$ is the macroscopically defined pressure tensor that tends, at
$t \to \infty$, to the pressure tensor corresponding to the velocity
gradient $\gamma$ as required by the Newton formula (8-246), one obtains

$$\eta = -\frac{1}{\gamma}\lim_{t\to\infty}\bar{P}_{xy}(t), \tag{8-252a}$$


where the bar on top of a phase space function denotes a non-equilibrium
average at any specified time, as in the earlier sections. Since the
equilibrium average of the pressure tensor in the absence of driving is
zero, one obtains, in keeping with our earlier notation,

$$\eta = -\frac{1}{\gamma}\lim_{t\to\infty}\left[\bar{P}_{xy}(t) - \langle P_{xy}\rangle\right] = -\frac{1}{\gamma}\lim_{t\to\infty}\Delta\bar{P}_{xy}(t), \tag{8-252b}$$

where $\Delta\bar{P}_{xy}(t)$ denotes the mean deviation of the x-y
component of the pressure tensor at time $t$.


We now invoke (8-140) and insert $h(t) = \theta(t)$ (recall that the
perturbation parameter $h$ ($= \gamma$ in the present context) has been
absorbed in the definition of the dissipative current; refer to (8-251)) so
as to arrive at our final result

$$\eta = \beta V\int_0^\infty dt\,\langle P_{xy}(0)P_{xy}(t)\rangle, \tag{8-253}$$


(check this out). This is again of the Kubo form, since the right hand side
is in the form of an integral over the equilibrium auto-correlation function
of the dissipative current at unequal times. Though the equilibrium average
$\langle P_{xy}\rangle$ vanishes, its auto-correlation at unequal times does
not, telling us that even at equilibrium, there arise short-lived anisotropic




fluctuations, due to which $P_{xy}$ is characterized by a non-zero
correlation time.
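In practice, a formula such as (8-253) is typically evaluated from an equilibrium time series of the microscopic $P_{xy}$ sampled at a fixed interval. The sketch below shows one simple way this could be organized: a time-averaged estimate of the auto-correlation function followed by a trapezoidal integration up to a chosen maximum lag. The function names, the estimator and the cutoff `max_lag` are illustrative assumptions and are not taken from [140]; a production calculation would also need error estimates and a convergence check on the cutoff.

```python
import math

def autocorrelation(series, max_lag):
    """Time-averaged estimate of <A(0) A(t)> at lags 0, 1, ..., max_lag (in samples)."""
    n = len(series)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(series)")
    return [sum(series[k] * series[k + lag] for k in range(n - lag)) / (n - lag)
            for lag in range(max_lag + 1)]

def green_kubo_viscosity(pxy_series, dt, beta, volume, max_lag):
    """Estimate of beta * V * int_0^(max_lag * dt) <P_xy(0) P_xy(t)> dt, cf. (8-253)."""
    acf = autocorrelation(pxy_series, max_lag)
    # Trapezoidal rule over the sampled correlation function.
    integral = dt * (0.5 * acf[0] + sum(acf[1:-1]) + 0.5 * acf[-1])
    return beta * volume * integral

if __name__ == "__main__":
    # Synthetic, purely illustrative data: a periodic signal with period 50 samples.
    series = [math.cos(2.0 * math.pi * k / 50.0) for k in range(1000)]
    acf = autocorrelation(series, 50)
    print(acf[0], acf[25], acf[50])
```

The synthetic cosine series is used only to exercise the estimator; real $P_{xy}$ data would have a correlation function that decays on the correlation time mentioned above.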


Complete derivations for mechanical and thermal transport coefficients, 
including the thermal conductivity, of liquids are to be found in [33], 


chapters 4, 5; see also [101], chapter 17. 


8.5 Linear response and non-equilibrium thermodynamics


In sec. 8.2 we looked at the basic principles of non-equilibrium thermody- 
namics where we considered a continuous distribution of small subsys- 
tems assumed to be in instantaneous local equilibrium and the locally 
defined thermodynamic variables were assumed to vary slowly in space 
and time. The system as a whole was assumed to proceed through a suc- 
cession of states during which there takes place a production of entropy 
that can be expressed in terms of locally defined affinities and fluxes. The 
latter are related linearly by means of a set of kinetic coefficients which, 
in turn, are assumed to satisfy among themselves a set of symmetry re- 


lations. 


In the present section, we will see how all this can be interpreted in terms 
of the linear response theory outlined above in sec. 8.4. However, for 
the sake of simplicity, we will consider an isolated system made up of a 
number of discrete subsystems instead of continuously distributed ones. 


Further simplification is achieved by referring to just two subsystems in
interaction, of which one will be assumed to be the system of interest (to
be referred to as $\Sigma$) while the other will be assumed to play the role of a
reservoir much larger than $\Sigma$. In the following, we base our presentation
principally on [93] and [107].


The total entropy of the composite system, which is assumed to be an 


isolated one, is expressed as the sum 


$$S_{\mathrm{tot}}(\{X_{\mathrm{tot}}\}) = S(\{X\}) + S_R(\{X_R\}), \tag{8-254a}$$

where the sub-index ‘tot’ is used for the composite system and ‘R’ for
the reservoir, while no separate sub-index is used to denote quantities
pertaining to $\Sigma$. For each of the three systems under consideration,
the entropy is a function of a set of independent extensive variables
($X_1, X_2, \cdots, X_M$, $M \ll N$, denoted collectively by $\{X\}$; $N$ stands for the
number of particles making up the system $\Sigma$) where, by the property of
extensivity,

$$X_{k\,\mathrm{tot}} = X_k + X_{kR} \quad (k = 1,2,\cdots,M). \tag{8-254b}$$


Irreversible processes occurring within the composite system sufficiently
close to an equilibrium state can be looked upon as a passage through
successive constrained equilibria of the system $\Sigma$ and the reservoir.
Since the composite system is assumed to be an isolated one, the following
constraint is satisfied in any such process:

$$\delta X_{k\,\mathrm{tot}} = 0, \quad \text{i.e.,} \quad \delta X_{kR} = -\delta X_k \quad (k = 1,2,\cdots,M). \tag{8-254c}$$

We define

$$A_k = \frac{\partial S_{\mathrm{tot}}}{\partial X_k} = \frac{\partial S}{\partial X_k} - \frac{\partial S_R}{\partial X_{kR}} \quad (k = 1,2,\cdots,M), \tag{8-255}$$


where (8-254a), (8-254c) have been made use of. In an equilibrium configuration,
all the $A_k$'s vanish ($\delta S_{\mathrm{tot}} = 0$), while a non-equilibrium process
is driven by a non-vanishing value of the $A_k$'s, which act as affinities relating
to various possible types of processes. Formula (8-255) expresses
the affinities in terms of the intensive parameters

$$\tau_k = \frac{\partial S}{\partial X_k}, \quad \tau_{kR} = \frac{\partial S_R}{\partial X_{kR}} \quad (k = 1,2,\cdots,M), \tag{8-256a}$$

as

$$A_k = \tau_k - \tau_{kR}, \tag{8-256b}$$

which corresponds to the relation (8-11) in the present simplified context
of a composite system made up of the system $\Sigma$ and the reservoir R.


The fluxes generated in the system of interest $\Sigma$ in response to affinities
$A_k$ are represented by

$$J_k = \frac{dX_k}{dt} \quad (k = 1,2,\cdots,M), \tag{8-257a}$$

in terms of which the entropy production in the composite system is given
by

$$\frac{dS_{\mathrm{tot}}}{dt} = \sum_k \left(\frac{\partial S}{\partial X_k} - \frac{\partial S_R}{\partial X_{kR}}\right)\frac{dX_k}{dt} = \sum_k A_k J_k, \tag{8-257b}$$

which corresponds to (8-10b) in the present simplified context. The second
law of thermodynamics therefore implies that the expression $\sum_k A_k J_k$
has to be a positive definite one.


A basic assumption of non-equilibrium thermodynamics is that the affinities
and fluxes are related linearly, i.e.,

$$J_k = \sum_l L_{kl} A_l \quad (k = 1,2,\cdots,M), \tag{8-258a}$$

which corresponds to formula (8-12) in the present simplified context.
The kinetic coefficients $L_{kl}$ are thus defined as

$$L_{kl} = \left(\frac{\partial J_k}{\partial A_l}\right)_0 \quad (k,l = 1,2,\cdots,M), \tag{8-258b}$$

where the sub-index 0 denotes that the derivative on the right hand side
is to be evaluated in the limit of all the affinities and currents going to
zero.


Based on the principle of microscopic reversibility, Onsager established
that the kinetic coefficients $L_{kl}$ appearing on the right hand side of this
equation satisfy the reciprocity relation

$$L_{kl} = L_{lk} \quad (l,k = 1,2,\cdots,M). \tag{8-258c}$$

The reciprocity relation, along with the positive definiteness of the entropy
production, serves as the basis of a large number of consequences
of practical relevance following from the principles of non-equilibrium
thermodynamics. In particular, the second equality in (8-257b), along
with the linearity relations (8-258a), implies that the quadratic form defined
by the coefficients $L_{kl}$ has to be positive definite, i.e.,
the eigenvalues of the matrix $L$ with elements $L_{kl}$ have to be positive.
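The connection between the linear law (8-258a), the entropy production (8-257b) and the positivity of the eigenvalues can be checked on a small example. The sketch below is only illustrative: the 2x2 matrix of kinetic coefficients and the sample affinities are hypothetical values chosen for the purpose, and the helper names are not from the text.

```python
import math


def fluxes(L, A):
    """Linear law (8-258a): J_k = sum_l L_kl A_l."""
    n = len(A)
    return [sum(L[k][l] * A[l] for l in range(n)) for k in range(n)]


def entropy_production(L, A):
    """Second equality of (8-257b): dS_tot/dt = sum_k A_k J_k."""
    return sum(a * j for a, j in zip(A, fluxes(L, A)))


def eigenvalues_sym_2x2(L):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    tr = L[0][0] + L[1][1]
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    disc = math.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 - disc, tr / 2.0 + disc


# Hypothetical symmetric matrix of kinetic coefficients (obeys L_kl = L_lk)
L = [[2.0, 0.5],
     [0.5, 1.0]]

lam_min, lam_max = eigenvalues_sym_2x2(L)

# A few arbitrary non-zero affinity vectors; each should give a positive rate
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-0.3, 2.0]]
rates = [entropy_production(L, A) for A in samples]
print(lam_min, lam_max, rates)
```

Both eigenvalues being positive is equivalent, for a symmetric matrix, to the quadratic form being positive for every non-zero choice of affinities; the sampled rates merely illustrate this.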


We will now interpret the above thermodynamic formulae in terms of ones
defined in the context of linear response theory in non-equilibrium statistical
mechanics. The time dependent thermodynamic variables $X_k(t)$ ($k = 1,2,\cdots,M$),
for instance, are interpreted as non-equilibrium averages of
corresponding phase space functions $X_k$. Recalling that a non-equilibrium
average is denoted by a bar overhead (refer to sec. 8.4), the quantity $X_k(t)$
occurring in the context of non-equilibrium thermodynamics is thus interpreted
as $\overline{X}_k(t)$. We assume for the sake of simplicity that the phase
space functions $X_k$ are defined in such a way that $\langle X_k\rangle = 0$ ($k = 1,2,\cdots,M$)
(recall that the symbol $\langle\cdot\rangle$ is used to denote equilibrium averages), in
which case $\overline{X}_k(t)$ stands for the same object as $\Delta\overline{X}_k(t)$ in the notation
of sec. 8.4.


Consider now any two of the phase space functions from among the set
of functions representing the thermodynamic variables $X_1, X_2, \cdots, X_M$, say
$X_k$ and $X_l$ ($k,l = 1,2,\cdots,M$). Assuming a perturbation of the form $-h(t)X_l$
to the Hamiltonian of the system $\Sigma$, where $h(t) = h\,\Theta(-t)e^{\epsilon t}$ ($\epsilon \to 0^+$), the
response in $X_k$ can be expressed as (refer back to section 8.4.7.3 and to
formula (8-117b), and recall our simplifying assumption that $\langle X_i\rangle = 0$
($i = 1,2,\cdots,M$))

$$\overline{X}_k(t) = \beta h\,\langle X_k(t)X_l(0)\rangle \quad (t > 0). \tag{8-259}$$

As we indicated in sections 8.4.3 and 8.4.7.3, this formula can be looked
upon as the interpretation of Onsager’s regression hypothesis in terms
of non-equilibrium statistical mechanics. Here one interprets the time
dependence of $\overline{X}_k$ as the way a thermodynamic variable regresses as the
system under consideration relaxes from a constrained equilibrium.


Within the confines of the linear response theory, we make the assumption
that the variation in $\overline{X}_k(t)$ will, in general, be described by a linear
equation of the form

$$\frac{d}{dt}\overline{X}_k(t) = -\sum_j M_{kj}\overline{X}_j(t), \tag{8-260}$$

where $M_{kj}$ ($k,j = 1,2,\cdots$) are a set of phenomenological coefficients, and
the equations as such are of a phenomenological type that will be found
below to be consistent with the basic assumption of non-equilibrium
thermodynamics pertaining to the linear relationship between affinities
and fluxes (sec. 8.2). In reality, a formula of the form (8-260) involves
a coarse-graining in the sense that variations occurring over short time
scales (characteristic of the microscopic dynamics) are ignored.


On making use of the regression formula (8-259) in the phenomenological
relation (8-260) one obtains, in the case of a perturbation coupled to
$X_l$,

$$\frac{d}{dt}\langle X_k(t)X_l(0)\rangle = -\sum_j M_{kj}\langle X_j(t)X_l(0)\rangle \quad (k,l = 1,2,\cdots,M). \tag{8-261}$$


We now invoke the symmetry properties of our system $\Sigma$ under time reversal.
Assuming for the sake of simplicity that all the phase space variables
$X_k$ ($k = 1,2,\cdots,M$) as also the Hamiltonian of the system have positive
signature under time reversal (refer back to section 8.4.8; note that the
present considerations are confined to the classical context), one obtains
the relation

$$\langle X_k(t)X_l(0)\rangle = \langle X_k(-t)X_l(0)\rangle, \quad \text{i.e.,} \quad \langle X_k(t)X_l(0)\rangle = \langle X_k(0)X_l(t)\rangle \quad (k,l = 1,2,\cdots,M), \tag{8-262}$$


where the second equality follows from the time translation invariance of 
the equilibrium correlations. It is to be noted that the above symmetry 
property of the time correlation functions is based on the assumption that 
there is no magnetic field influencing the system dynamics. In keeping 
with the observations made in sec. 8.2.2, the symmetry relations have to 


be appropriately modified if a magnetic field is present. 


Taking the time derivative on both sides of the second equality in (8-262)
and making use of (8-261), one arrives at

$$\sum_j M_{kj}\langle X_j(t)X_l(0)\rangle = \sum_j M_{lj}\langle X_j(t)X_k(0)\rangle, \tag{8-263a}$$

which implies, in particular (putting $t = 0$),

$$\sum_j M_{kj}\langle X_j X_l\rangle = \sum_j M_{lj}\langle X_j X_k\rangle \quad (k,l = 1,2,\cdots,M), \tag{8-263b}$$

where, for any $k,l = 1,2,\cdots,M$ we use the notation $\langle X_k X_l\rangle = \langle X_k(0)X_l(0)\rangle$.
We shall now relate the expression $\sum_j M_{kj}\langle X_j X_l\rangle$ with the kinetic coefficients
$L_{kj}$ ($k,j = 1,2,\cdots$) occurring in the phenomenological relations between
the thermodynamic fluxes and affinities assumed in non-equilibrium
thermodynamics, and defined as in (8-258b).


Looked at from the point of view of statistical mechanics, the affinities
and currents are all related to phase space functions and are characterized
by fluctuations. On making use of (8-257a) in (8-258a), multiplying
both sides of the resulting equation with $X_j(0)$, and then evaluating the
equilibrium expectation value of both sides, one obtains

$$\frac{d}{dt}\langle X_k(t)X_j(0)\rangle = \sum_l L_{kl}\langle A_l(t)X_j(0)\rangle \quad (k,j = 1,2,\cdots,M), \tag{8-264a}$$

or, making use of (8-261),

$$-\sum_l M_{kl}\langle X_l(t)X_j(0)\rangle = \sum_l L_{kl}\langle A_l(t)X_j(0)\rangle \quad (k,j = 1,2,\cdots,M). \tag{8-264b}$$

Making the particular choice $t = 0$, one obtains the above relation in the
form

$$-\sum_l M_{kl}\langle X_l X_j\rangle = \sum_l L_{kl}\langle A_l X_j\rangle \quad (k,j = 1,2,\cdots,M). \tag{8-264c}$$


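It now remains to evaluate the correlation $\langle A_l X_j\rangle$ ($l,j = 1,2,\cdots$) occurring
on the right hand side of the above formula. On using the definition
(8-255) of the affinities, one obtains

$$\langle A_l X_j\rangle = \left\langle \frac{\partial S}{\partial X_l} X_j\right\rangle - \left\langle \frac{\partial S_R}{\partial X_{lR}} X_j\right\rangle. \tag{8-265}$$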


The second term on the right hand side goes to zero in the limit of the
reservoir size going to infinity in comparison with the size of $\Sigma$ since the
fluctuations of the reservoir variables go to zero. The first term on the
right hand side can be evaluated by noting that the relevant averages
are over the entire phase space, and can be evaluated by averaging over
the macroscopic variables $X_1, X_2, \cdots, X_M$ by making use of the result that,
given a set of values of these variables, the accessible phase space volume
compatible with these values is given by

$$\Gamma(\{X\}) = \Gamma_0\, e^{S(\{X\})/k_B}, \tag{8-266a}$$

where $\Gamma_0\,(= h^{3N} N!)$ is a constant. Hence an expectation value of some
function, say, $F(\{X\})$ of these variables can be expressed as

$$\langle F(\{X\})\rangle = C\int \prod_j dX_j\, F(\{X\})\, e^{S(\{X\})/k_B}, \tag{8-266b}$$

where $C$ is a normalization constant given by

$$C^{-1} = \int \prod_j dX_j\, e^{S(\{X\})/k_B}. \tag{8-266c}$$
j 
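With this result on equilibrium expectation values, formula (8-265) can
be written as

$$\langle A_l X_j\rangle = C\int \prod_k dX_k\, X_j\, \frac{\partial S}{\partial X_l}\, e^{S(\{X\})/k_B} = C k_B\int \prod_k dX_k\, X_j\, \frac{\partial}{\partial X_l} e^{S(\{X\})/k_B}, \tag{8-267a}$$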


or, performing an integration by parts on the right hand side,

$$\langle A_l X_j\rangle = C k_B\int {\prod_k}' dX_k \left[X_j\, e^{S(\{X\})/k_B}\right]_{X_l^-}^{X_l^+} - C k_B\,\delta_{lj}\int \prod_k dX_k\, e^{S(\{X\})/k_B}, \tag{8-267b}$$

where, in the first term on the right hand side, ${\prod_k}'$ stands for a product
over $k$ ranging from 1 to $M$, with the value $l$ excluded, and $X_l^-$, $X_l^+$ denote
the lower and upper bounds of the phase space variable $X_l$. In the second
term on the right hand side we have made use of the fact that $X_l$, $X_j$ are
independent variables for $l \neq j$.


The first term on the right hand side vanishes in the thermodynamic limit
$N \to \infty$ because (recalling the definition of $C$ from (8-266c)) it is of the
form $I'/I$, where $I'$ is an integral over a subspace of dimension $M - 1$ in the
space of variables $\{X\}$, which is equivalent to a subspace of dimension
$N - \nu$ in the phase space of dimension $N$ where $\nu\,(< N) \sim N$ (recall that
$M \ll N$) while, on the other hand, $I$ is an integral over the entire phase
space. Thus, what remains on the right hand side of (8-267b) is, simply,

$$\langle A_l X_j\rangle = -k_B\,\delta_{lj}, \tag{8-267c}$$

where we have made use of the definition (8-266c) once again.
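As a quick check of (8-267c), one may consider the case of a single variable $X$ ($M = 1$) with the entropy expanded to second order about equilibrium. The quadratic form and the positive constant $g$ below are assumptions introduced only for this illustration, and the reservoir contribution to the affinity is dropped inside the correlation, as argued above.

```latex
\begin{align*}
S(X) &= S_0 - \tfrac{1}{2}\, g X^2 \quad (g > 0), &
A &= \frac{\partial S}{\partial X} = -gX, \\
\langle X^2\rangle &= \frac{\int dX\, X^2\, e^{-gX^2/2k_B}}{\int dX\, e^{-gX^2/2k_B}} = \frac{k_B}{g}, &
\langle AX\rangle &= -g\,\langle X^2\rangle = -k_B .
\end{align*}
```

The result is independent of $g$, in agreement with the general formula $\langle A_l X_j\rangle = -k_B\,\delta_{lj}$.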


This finally gives, from (8-264c),

$$L_{kj} = \frac{1}{k_B}\sum_l M_{kl}\langle X_l X_j\rangle, \tag{8-268a}$$

and, from (8-263b),

$$L_{kj} = L_{jk} \quad (k,j = 1,2,\cdots,M), \tag{8-268b}$$

thereby verifying the reciprocity relation postulated by Onsager. The coefficients
$L_{kj}$ appear as the elements of a matrix $L$ which, in virtue of the
reciprocity relation, is a symmetric one.
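Formula (8-268a) can be tried out on a Gaussian model, sketched below in Python. The model is an assumption made only for illustration: the entropy is taken as $S = S_0 - \frac{1}{2}\sum_{kl} g_{kl}X_kX_l$ with a symmetric positive definite matrix $g$, so that $A_k = -\sum_l g_{kl}X_l$, $\langle X_kX_l\rangle = k_B (g^{-1})_{kl}$, and the relaxation matrix of (8-260) is $M = Lg$. The numerical entries and helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(G):
    """Inverse of a 2x2 matrix."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [[G[1][1] / det, -G[0][1] / det],
            [-G[1][0] / det, G[0][0] / det]]


K_B = 1.0  # Boltzmann constant in arbitrary units

# Assumed curvature of the entropy and assumed (symmetric) kinetic coefficients
g = [[2.0, 0.3],
     [0.3, 1.0]]
L_true = [[1.0, 0.4],
          [0.4, 3.0]]

# Relaxation matrix of (8-260): dX/dt = L A = -L g X, hence M = L g
M = matmul(L_true, g)

# Equilibrium covariance <X_k X_l> = k_B (g^{-1})_kl for the Gaussian weight
cov = [[K_B * x for x in row] for row in inverse_2x2(g)]

# Formula (8-268a): L_kj = (1/k_B) sum_l M_kl <X_l X_j>
L_rec = [[x / K_B for x in row] for row in matmul(M, cov)]
print(M, L_rec)
```

The relaxation matrix $M$ is not symmetric in general, whereas the combination appearing in (8-268a) is; this is the content of the reciprocity relation in this simple setting.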


In closing this section, we consider the rate of change of entropy production
in a non-equilibrium process.

Making use of the linear relation (8-258a) in the second equality of formula
(8-257b), one obtains

$$\dot{S} = \sum_{j,k} L_{jk} A_j A_k, \tag{8-269}$$

where the entropy production $\dot{S}$ appears as a quadratic form involving the
kinetic coefficients $L_{jk}$. The positive definiteness of the entropy production
implies a number of restrictions on the matrix of kinetic coefficients
(in addition to the symmetry relation (8-268b)) including, in particular,
the positivity of the diagonal elements

$$L_{kk} > 0 \quad (k = 1,2,\cdots,M). \tag{8-270}$$

We now look at the rate of change of the entropy production, given by

$$\frac{d\dot{S}}{dt} = 2\sum_{j,k} L_{jk} A_k \frac{\partial A_j}{\partial t} = 2\sum_{j,l} \frac{\partial A_j}{\partial X_l} J_j J_l, \tag{8-271a}$$


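where (8-257a) and (8-258a) have again been made use of. Invoking the
definition (8-255) of the affinities, one obtains

$$\frac{\partial A_j}{\partial X_l} = \frac{\partial^2 S}{\partial X_l\,\partial X_j} - \frac{\partial}{\partial X_l}\frac{\partial S_R}{\partial X_{jR}}, \tag{8-271b}$$

in which the second term on the right hand side is zero since the reservoir
variables in $\partial S_R/\partial X_{jR}$ are independent of the system variables $X_l$. This gives

$$\frac{d\dot{S}}{dt} = 2\sum_{j,l} \frac{\partial^2 S}{\partial X_j\,\partial X_l} J_j J_l, \tag{8-271c}$$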
where the second derivative is to be evaluated with all the parameters
$X_j$ set at their equilibrium values. The principle of maximum entropy of
the equilibrium state (as compared with the entropies of non-equilibrium
states, interpreted as states in constrained equilibrium, with slightly different
values of the parameters $X_j$ ($j = 1,2,\cdots,M$)), implies that $S$ is a
concave function of the $X_j$'s close to equilibrium, i.e.,

$$\frac{d\dot{S}}{dt} = 2\sum_{j,l} \frac{\partial^2 S}{\partial X_j\,\partial X_l} J_j J_l \le 0, \tag{8-272}$$

regardless of the currents $J_k$ ($k = 1,2,\cdots,M$), the equality sign being valid
only at equilibrium (i.e., $J_k = 0$ for all $k$).


This, of course, is just another expression of the principle of maximum 
entropy which states that for any non-equilibrium process close to an 
equilibrium configuration (where the system under consideration may be 
assumed to pass through a succession of constrained equilibria, so that 
the entropy of each such state can be defined as the sum of the entropies 
of subsystems, all in local equilibrium), the entropy production is positive 
($\dot{S} > 0$) while the rate of change of the entropy production is negative. The
latter characteristic of a non-equilibrium process implies that the system, 
when disturbed away from an equilibrium configuration, tends to return 
to the latter. If the system be an isolated one, it actually tends asymptot- 
ically to the equilibrium configuration, when the entropy production and 


its rate of change both become zero. 


If, on the other hand, it be prevented from actually returning to the equi- 
librium state by imposing one or more constraints, then it may evolve 
in a different way. For instance, if an affinity be maintained at a non- 
zero value then the system may asymptotically tend to a non-equilibrium 
steady state, described by stationary values of the thermodynamic pa- 
rameters, including non-zero constant values of the affinity and of one or 


more flux(es). It is the non-zero values of one or more affinities and fluxes
that distinguish a steady state from an equilibrium one.


In the case of a steady state, the rate of change of entropy production 
vanishes while the entropy production itself continues to be positive (and 
constant), and the concavity of S implies that the entropy production 
has actually reached a minimum value compatible with the constraints 
imposed on the system. In other words, a near-equilibrium steady state 
is characterized by a minimum entropy production principle (refer to [107], 


[93] for further discussions and simple examples). 
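The minimum entropy production statement can be illustrated with two coupled processes, one of whose affinities is held fixed. The Python sketch below is a toy example under stated assumptions: the symmetric 2x2 matrix of kinetic coefficients and the value of the fixed affinity are hypothetical, and the entropy production is taken in the quadratic form (8-269).

```python
def entropy_production(L, A):
    """Quadratic form (8-269): sum_jk L_jk A_j A_k."""
    n = len(A)
    return sum(L[j][k] * A[j] * A[k] for j in range(n) for k in range(n))


# Hypothetical symmetric kinetic coefficients (Onsager reciprocity assumed)
L = [[2.0, 0.5],
     [0.5, 1.0]]

A1 = 1.0  # affinity maintained at a non-zero value by an external constraint

# Stationarity with respect to the unconstrained affinity A2:
# d(sigma)/dA2 = 2 (L_21 A1 + L_22 A2) = 2 J_2 = 0
A2_star = -L[1][0] * A1 / L[1][1]

J1_star = L[0][0] * A1 + L[0][1] * A2_star   # flux conjugate to the fixed affinity
J2_star = L[1][0] * A1 + L[1][1] * A2_star   # flux conjugate to the free affinity

sigma_star = entropy_production(L, [A1, A2_star])
sigma_nearby = [entropy_production(L, [A1, A2_star + d])
                for d in (-0.2, -0.05, 0.05, 0.2)]
print(A2_star, J1_star, J2_star, sigma_star, sigma_nearby)
```

In this toy case the state of minimum entropy production is the one in which the flux conjugate to the unconstrained affinity vanishes while the flux conjugate to the fixed affinity stays non-zero, which is what one expects of a steady state.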


In summary, we have, in the present section, seen that the linear re- 
sponse theory provides us with a means of interpreting the basic princi- 
ples of non-equilibrium thermodynamics in terms of those of non-equilibrium 
statistical mechanics. The interpretation includes the definitions of non- 
equilibrium fluxes and affinities, where the former are related to the lat- 
ter through the kinetic coefficients that have been seen to satisfy the reci- 
procity relations assumed in non-equilibrium thermodynamics on the ba- 
sis of time reversal symmetry. Finally, we have seen that near-equilibrium 
irreversible processes are characterized by positive definite values of the 
entropy production and negative definite values of the rate of change of 
entropy production, where the latter implies the minimum entropy pro- 


duction principle for near-equilibrium steady states. 


In setting up this interpretation of non-equilibrium thermodynamics in
terms of the linear response theory, we have looked at a simplified situation
involving a system $\Sigma$ in weak interaction with a reservoir R.
However, one can generalize to a chain of discrete systems interacting
through their common walls of separation, or further, to a continuous
distribution of small subsystems making up the system of interest (as in 
sec. 8.2) within which there occur irreversible processes through a suc- 
cession of near-equilibrium configurations that can be treated as contin- 
uously changing states of constrained equilibria. Once again, the reci- 
procity relations, the positive definiteness of the entropy production, and 
the negative definiteness of the rate of change of entropy production fol- 


low from the theory (refer to [93], [15]). 


8.6 Hydrodynamic modes in linear response theory


The linear response theory provides the framework within which the hy- 
drodynamic modes of a system can be related to the microscopic dynam- 
ics and relevant transport coefficients can be evaluated, as mentioned in 


sec. 8.4.11.3. 


One sets up the balance equations in terms of microscopically defined
quantities and then introduces a set of phenomenological coefficients in
terms of which the rates of change of the non-equilibrium averages of
the local densities are related with affinities, the latter being likewise
expressed in terms of the non-equilibrium averages. One thereby obtains
the constitutive equations for the system under consideration. These are
then introduced in the balance equations where, once again, quantities
of interest are expressed as expectation values. In the end, one arrives
at a set of coupled linear equations describing the hydrodynamic modes
of the system in the asymptotic regime of large $t$. This is the approach
outlined in sec. 8.4.11.3 in connection with tagged particle diffusion and
shear viscosity (see also sec. 8.4.11.2 where the derivation of the Kubo
formula for the electrical conductivity is outlined).


The coupled equations are typically of a diffusive nature (refer to sec. 8.2.3.3) 
and can be solved in terms of normal modes by making use of Laplace 
transformation, from which relevant correlation functions and dynamic 
structure factors can be worked out. Relating these to scattering data 
in inelastic neutron scattering, the transport coefficients can be obtained 


and compared with results from the relevant Kubo formulas. 


As a simple instance of this exercise (in addition to those in sec. 8.4.11.3) 
we consider a fluid made up of particles with spin magnetic moments, 
in which the interaction between the particles can be assumed to be 
spin-independent. This constitutes a simplified model of helium-3 in the 
normal phase ([40], chapter 2; [98], chapter 5; helium-3 forms a superfluid
phase in the mK range of temperature owing to a BCS-type pairing of 
helium-3 atoms). The spins are assumed for the sake of simplicity to 
be two-state scalars pointing either ‘up’ or ‘down’, and the absence of 
spin-spin interactions means that spin-flip processes are of no relevance. 


Denoting by $S_i$ the spin of the ith particle ($i = 1,2,\cdots,N$), one has

$$\langle S_i\rangle = 0, \quad \langle S_i S_j\rangle = \delta_{ij} S^2 \quad (i,j = 1,2,\cdots,N), \tag{8-273}$$

where $S^2$ stands for the time-independent squared spin of each particle.


The macroscopic variable we will be interested in is the local magnetic
moment density

$$M(\mathbf{r},t) = \sum_{i=1}^{N} S_i\,\delta(\mathbf{r} - \mathbf{r}_i(t)), \tag{8-274}$$


where $\mathbf{r}_i(t)$ is the time-dependent position operator of the ith spin in the
Heisenberg picture (we omit the hat symbol over operators; while our 
considerations are in the quantum context here, classical considerations 
proceed along a similar line, with entirely analogous results). With the 


Hamiltonian written as 


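$$H = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m} + \frac{1}{2}\sum_{i \neq j}^{N} V(\mathbf{r}_i - \mathbf{r}_j), \tag{8-275}$$
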
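the Heisenberg equation for $M(\mathbf{r},t)$ is seen to be

$$\frac{\partial M(\mathbf{r},t)}{\partial t} = -\frac{1}{2m}\sum_{i=1}^{N} S_i\,\nabla\cdot\left[\mathbf{p}_i(t)\,\delta(\mathbf{r} - \mathbf{r}_i(t)) + \delta(\mathbf{r} - \mathbf{r}_i(t))\,\mathbf{p}_i(t)\right] = -\nabla\cdot\mathbf{j}(\mathbf{r},t), \tag{8-276a}$$

where the magnetization current density $\mathbf{j}(\mathbf{r},t)$ is given by the anti-commutator

$$\mathbf{j}(\mathbf{r},t) = \frac{1}{2m}\sum_{i=1}^{N} S_i\,\{\mathbf{p}_i(t), \delta(\mathbf{r} - \mathbf{r}_i(t))\}_+. \tag{8-276b}$$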


This is the operator form of the balance equation of the magnetization
density from which one can verify that the total magnetization

$$M = \int d^3r\, M(\mathbf{r},t), \tag{8-277}$$

is conserved, i.e., $M(\mathbf{r},t)$ constitutes a hydrodynamic mode, as it should.
Taking the non-equilibrium average of $M(\mathbf{r},t)$ (in the following, we use
the notation $\langle\cdot\rangle_{\mathrm{noneq}}$ to denote a non-equilibrium average; recall that this
was denoted earlier by a bar over the symbol representing the observable
quantity under consideration), we obtain the balance equation in the form

$$\frac{\partial \langle M(\mathbf{r},t)\rangle_{\mathrm{noneq}}}{\partial t} = -\nabla\cdot\langle \mathbf{j}(\mathbf{r},t)\rangle_{\mathrm{noneq}}. \tag{8-278a}$$

We now introduce the macroscopic constitutive relation

$$\langle \mathbf{j}(\mathbf{r},t)\rangle_{\mathrm{noneq}} = -D\,\nabla\langle M(\mathbf{r},t)\rangle_{\mathrm{noneq}}, \tag{8-278b}$$


where D stands for a phenomenological diffusion constant. The simplify- 
ing feature of this spin diffusion problem is that there is no coupling with 


other macroscopic affinity or flux. 


Introducing the constitutive relation on the right hand side of (8-278a), 
we get the diffusion equation 


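$$\frac{\partial \langle M(\mathbf{r},t)\rangle_{\mathrm{noneq}}}{\partial t} = D\,\nabla^2\langle M(\mathbf{r},t)\rangle_{\mathrm{noneq}}. \tag{8-278c}$$

Here and below $M(\mathbf{r},t)$ will stand for $M(\mathbf{r},t) - \langle M(\mathbf{r},t)\rangle_{\mathrm{eq}}$, based on the
observation that $\langle M(\mathbf{r},t)\rangle_{\mathrm{eq}} = 0$.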


We now make use of the Fourier-Laplace transform

$$M(\mathbf{k},z) = \int_0^\infty dt\, e^{izt}\int d^3r\, e^{-i\mathbf{k}\cdot\mathbf{r}}\,\langle M(\mathbf{r},t)\rangle_{\mathrm{noneq}}, \tag{8-279}$$

(note that the sub-index ‘noneq’ has been dropped on the left hand side for
the sake of brevity), where the complex frequency $z$ is to lie in the upper
half plane for the integral on the right to converge. Here we consider an
infinitely extended system so that boundary conditions (other than that of
vanishing at infinity, and integrability) are not to be taken into account.
On making use of (8-278c), one obtains the solution

$$M(\mathbf{k},z) = \frac{i}{z + iDk^2}\,M(\mathbf{k},t = 0), \tag{8-280a}$$


(check this out). The pole on the negative imaginary axis is characteristic 
of a diffusive process. In the present simple case, one obtains by inverse 


Laplace transformation the familiar solution 
$$M(\mathbf{k},t) = e^{-Dk^2t}\,M(\mathbf{k},t = 0), \tag{8-280b}$$


(compare with (8-24d), where the notation is slightly different). In the 
case of an initial delta-function magnetization density at r = 0 one ob- 
tains, by inverse Fourier transformation, the characteristic Gaussian 


spread for diffusive processes 
$$M(\mathbf{r},t) = M_0\,(4\pi Dt)^{-3/2}\,e^{-r^2/4Dt}. \tag{8-280c}$$
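The step from the diffusion equation (8-278c) to the pole form (8-280a) can be sketched as follows. This is only an outline under the sign conventions assumed here for the Fourier-Laplace transform (growth factor $e^{izt}$ in time, with $\mathrm{Im}\,z > 0$), not a substitute for the full working.

```latex
\begin{align*}
\frac{\partial M(\mathbf{k},t)}{\partial t} &= -Dk^2\, M(\mathbf{k},t)
  && \text{(spatial Fourier transform of (8-278c))},\\
\int_0^\infty dt\, e^{izt}\,\frac{\partial M(\mathbf{k},t)}{\partial t}
  &= -M(\mathbf{k},t=0) - iz\,M(\mathbf{k},z)
  && \text{(integration by parts, } \mathrm{Im}\,z > 0\text{)},\\
\left(Dk^2 - iz\right) M(\mathbf{k},z) &= M(\mathbf{k},t=0)
  \;\Longrightarrow\;
  M(\mathbf{k},z) = \frac{i}{z + iDk^2}\,M(\mathbf{k},t=0).
\end{align*}
```

The pole at $z = -iDk^2$ corresponds, on inverting the Laplace transform, to the exponential decay $e^{-Dk^2t}$.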


One observes from (8-280b) that a mode with wave number $k$ is damped
in time $\tau_k \sim \frac{1}{Dk^2}$, which becomes large for small $k$ as compared with typical
collision times in the case of liquid helium-3 even at quite low temperatures.
As mentioned several times in earlier paragraphs, this is typical
of hydrodynamic modes that describe the dynamics of densities of conserved
quantities which spread out slowly over large spatial distances


while all other modes get damped at much shorter times. 
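The scaling of the damping time with wave number, and the conservation of the total magnetization under the Gaussian spread, can be checked numerically. The following Python sketch is a minimal illustration; the values of $D$, $k$ and $t$ are arbitrary assumptions, and the function names are not from the text.

```python
import math


def mode_amplitude(k, t, D, m0=1.0):
    """Decay of a Fourier mode, M(k,t) = exp(-D k^2 t) M(k,0)."""
    return m0 * math.exp(-D * k * k * t)


def relaxation_time(k, D):
    """Damping time of the mode with wave number k, tau_k = 1/(D k^2)."""
    return 1.0 / (D * k * k)


def gaussian_profile(r, t, D, m_total=1.0):
    """Three-dimensional Gaussian spread from an initial delta function at r = 0."""
    return m_total * (4.0 * math.pi * D * t) ** -1.5 * math.exp(-r * r / (4.0 * D * t))


D = 1.0   # assumed diffusion constant (arbitrary units)
k = 2.0   # assumed wave number

# Halving the wave number makes the mode four times longer-lived
tau_ratio = relaxation_time(k / 2.0, D) / relaxation_time(k, D)

# After one damping time the amplitude has dropped by a factor e
amp_after_tau = mode_amplitude(k, relaxation_time(k, D), D)

# Total magnetization: radial trapezoidal integral of 4 pi r^2 M(r,t) at t = 0.5
t, r_max, n = 0.5, 12.0, 12000
dr = r_max / n
total = 0.0
for i in range(n + 1):
    r = i * dr
    weight = 0.5 if i in (0, n) else 1.0
    total += weight * 4.0 * math.pi * r * r * gaussian_profile(r, t, D) * dr
print(tau_ratio, amp_after_tau, total)
```

The integral stays equal to the initial total magnetization while the profile broadens, as expected for the density of a conserved quantity.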


We now relate the above with results in linear response theory established
earlier. Considering the relaxation of an initial fluctuation of the magnetization
density with wave vector $\mathbf{k}$, we obtain, by taking the Laplace transform
of (8-177), the fluctuation of $M$ in the complex frequency domain as

$$M(\mathbf{k},z) = R_{MM}(\mathbf{k},z)\,M(\mathbf{k},t = 0), \tag{8-281}$$

where we have made use of (8-175) and where $R_{MM}(\mathbf{k},z)$ is the Laplace
transform of the relaxation function $R_{MM}(\mathbf{k},t)$ given, in terms of the static
susceptibility $\chi_{MM}(\mathbf{k})$ and the dissipation function $\chi''_{MM}(\mathbf{k},\omega)$, by (refer to eq. (8-176))

$$R_{MM}(\mathbf{k},t) = \frac{1}{\chi_{MM}(\mathbf{k})}\int \frac{d\omega}{\pi}\,\frac{\chi''_{MM}(\mathbf{k},\omega)}{\omega}\,e^{-i\omega t}. \tag{8-282}$$


Here the relaxation problem refers to an initial fluctuation given by the equilibrium
state of the perturbed Hamiltonian $H - hM(\mathbf{k},t = 0)$, and the static
susceptibility for wave vector $\mathbf{k}$ is defined as the response $\langle M(\mathbf{k})\rangle$ to the above
perturbed Hamiltonian imagined to be held fixed.


Combining (8-280a) with (8-281) one obtains

$$R_{MM}(\mathbf{k},z) = \frac{i}{z + iDk^2}. \tag{8-283}$$

On comparing with the Laplace transform of (8-282), one finds finally [98]
the following expression for the dissipation function

$$\chi''_{MM}(\mathbf{k},\omega) = \chi_{MM}(\mathbf{k})\,\frac{\omega Dk^2}{\omega^2 + (Dk^2)^2}. \tag{8-284a}$$


At the same time, the dynamic susceptibility $\chi_{MM}(\mathbf{k},z)$, related to the
dissipation function as in (8-190), is given by

$$\chi_{MM}(\mathbf{k},z) = \frac{iDk^2\,\chi_{MM}(\mathbf{k})}{z + iDk^2}. \tag{8-284b}$$

Since thermodynamic stability requires $\chi_{MM}(\mathbf{k}) > 0$, $D > 0$, we see that
$\omega\chi''_{MM}(\mathbf{k},\omega) > 0$ for $\omega \neq 0$, as it should be (refer to sec. 8.4.9).


Once the dissipation function is known, one can obtain all other correlation
functions. In particular, the hydrodynamic form of the Kubo
correlation (the first equality in (8-181)) is given by

$$K_{MM}(\mathbf{k},\omega) = \frac{2}{\beta}\,\chi_{MM}(\mathbf{k})\,\frac{Dk^2}{\omega^2 + (Dk^2)^2}. \tag{8-285}$$


The hydrodynamic form of the spectral functions is typically a Lorentzian
with width $Dk^2$. Experimentally, the diffusion coefficient $D$ is determined
from neutron scattering data, while the theoretical expression is given by 


the Green-Kubo formula [98]. 
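The Lorentzian shape and its width can be made concrete with a few lines of Python. The sketch below assumes the hydrodynamic form $\chi''(k,\omega) = \chi(k)\,\omega Dk^2/(\omega^2 + (Dk^2)^2)$ for the dissipation function; the function names and the numerical values of $D$, $k$ and the static susceptibility are hypothetical choices for illustration.

```python
def chi_dissipative(omega, k, D, chi_static):
    """Hydrodynamic dissipation function chi''(k, omega) for a diffusive mode."""
    gamma = D * k * k
    return chi_static * omega * gamma / (omega * omega + gamma * gamma)


def lorentzian(omega, k, D):
    """chi''/(omega * chi_static): a Lorentzian centred at omega = 0."""
    gamma = D * k * k
    return gamma / (omega * omega + gamma * gamma)


D, k, chi_static = 0.5, 2.0, 3.0   # assumed values (arbitrary units)
width = D * k * k                  # expected half width at half maximum

peak = lorentzian(0.0, k, D)
at_half_width = lorentzian(width, k, D)

# omega * chi'' should be non-negative, and chi'' odd in omega
omegas = [-5.0, -1.0, -0.1, 0.1, 1.0, 5.0]
dissipation = [w * chi_dissipative(w, k, D, chi_static) for w in omegas]
odd_check = chi_dissipative(1.3, k, D, chi_static) + chi_dissipative(-1.3, k, D, chi_static)
print(peak, at_half_width, dissipation, odd_check)
```

The height of the Lorentzian falls to half its peak value at $\omega = Dk^2$, which is how a measured line width at a given $k$ translates into an estimate of $D$.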


8.7 Non-equilibrium statistical mechanics: reduced systems

The content of the present section is based principally on [83], [150],
[140] and [14].


8.7.1 The Langevin equation 
8.7.1.1 The Langevin equation: introduction 


A class of situations in non-equilibrium statistical mechanics can be con- 
veniently described and explained by analogy with the Brownian motion 
of a particle of large size (such as a pollen grain) suspended in a liquid 
held at a fixed temperature. The pollen grain is large in comparison with 
the molecular size and the average molecular separation in the liquid, 
but is small in comparison with the bulk of the liquid. Viewed under a
microscope, the pollen grain (the ‘Brownian particle’) shows an irregular 
dancing motion (as depicted pictorially in fig. 8-7) due to impacts of the 
liquid molecules arising from equilibrium fluctuations of the latter, while 
this motion of the Brownian particle does not alter the state of equilib- 


rium of the liquid as a whole. 


The forces acting on the Brownian particle can be classified into two 
types: (1) the forces arising from the irregular but frequent impacts with 
the liquid molecules, and (2) forces of external origin, if any, an instance 
of such external influence being the gravitational force exerted on the 
Brownian particle. The impact forces acting irregularly but almost in- 
stantaneously have a dual effect, namely one causing a random irregular 
motion (the Brownian motion) and the other impeding any systematic 
motion that the particle may possess. The latter is recognized as the viscous
resistance to the motion of the particle and appears as a systematic
impeding force (in contrast to the one causing the irregular motion) in
virtue of the fact that the impacts of the liquid molecules are frequent
and short-lived in comparison to the relatively slow systematic motion of
the particle owing to its larger inertia.

Figure 8-7: Pictorial illustration of the Brownian motion of a particle of large
mass suspended within a liquid bath B in equilibrium at some given temperature
T; in virtue of the equilibrium fluctuations of the microscopic states of the liquid,
the Brownian particle executes an irregular dancing motion due to the impacts
of the liquid molecules, where the force of impact has a random and a systematic
part, of which the former may be assumed to obey (8-287); the state of the
liquid remains unchanged while the motion of the particle can be described by
the Langevin equation (8-286); in view of its effect on the motion of the particle,
the systematic part of the random force of impact can be said to be resistive
in nature; in the absence of an external force, the particle eventually shares
the equilibrium fluctuations of the liquid; S denotes the instantaneous position
of the Brownian particle, the liquid molecules around which are schematically
shown with dots; the dashed circles represent subsequent positions of the particle,
joined with line segments generating an irregular zig-zag path.


Confining our attention to the motion of the Brownian particle along a 
single axis (the x-axis) of a co-ordinate system (the motions along the 
other two axes are independent and similar), its equation of motion can 


be expressed in the form 


$$m\dot{v} = -\gamma v + F(t) + R(t), \tag{8-286}$$

where $m$ stands for the mass of the particle and $v$ for its instantaneous
velocity. The resistive force is assumed to be proportional to the instantaneous
velocity (as in the case of the viscous drag on a body moving
through a stationary fluid), with a constant of proportionality $\gamma$. The external
force $F(t)$ is assumed to be independent of $x$, though a force field
derived from a potential can also be considered for the sake of generality
(see below). Finally, $R(t)$ stands for a force described by a random variable
(referred to as the ‘noise’), responsible for the Brownian motion. The
statistical characteristics of $R(t)$ will be described below.


Before exploring the consequences following from eq. (8-286), we pause to 
see what its relevance might be in the bigger context of non-equilibrium 
statistical mechanics. Though we have referred to a single Brownian par- 
ticle, one may generalize to the case of a subsystem (S; the Brownian 
particle in the present context) interacting with a bigger system (B; the 
liquid), forming a composite system (C), where the subsystem may addi-
tionally be subjected to some external influence such as the force $F$ in
eq. (8-286). Assuming that S is, in some sense, small compared to B, and 
that the latter is in equilibrium at some temperature T, the interaction 
between S and B may have a dual significance in respect of S, involving a 
random fluctuation and a systematic resistive force causing a dissipation 
of energy as it responds to the external force. The effect on B, on the 
other hand, can be ignored on the assumption that S is a small system, 


not capable of perceptibly altering the state of equilibrium of B. 


This paradigm of a subsystem interacting with a bigger system in equilibrium
whereby typical observables pertaining to S are found to exhibit
correlated fluctuations (see below) is relevant for a large class of problems
in statistical mechanics. The fluctuations are a consequence of the
equilibrium fluctuations in the bigger system B with which the system S 
interacts, while the correlations determine the response of S to external 
driving forces, if any. In the absence of an external force, the correlations 
die down as S assumes the state of equilibrium compatible with that of 
B. As we see below, such behavior is analogous to that of the Brownian 
particle described by eq. (8-286) where velocity correlations pertaining to 
the particle determine its response to external forces and, at the same 
time, account for the dissipation of energy and its return to equilibrium 


in the absence of external forces. 


The two aspects of the fluctuating forces on S caused by its interaction 
with B, one appearing as a systematic effect and the other as a random 
one (represented respectively by the first and the third terms on the right
hand side of (8-286)) appear to be distinct from each other only at a 
phenomenological level, and arise fundamentally from a common origin, 
namely the equilibrium fluctuations of B, and hence are related. It is 
this relation that is expressed by the fluctuation-dissipation theorem. In 
sec. 8.7.1.2 below, we will get to see the specific form that the theorem 
assumes in the context of equation (8-286) and then explain how it fits in 


the more general formulation of sec. 8.4.4. 


Notice that in the equation (8-286), there is no overt reference to the large 
number of degrees of freedom of the liquid in equilibrium with which the 
Brownian particle is in interaction. Instead, the effect of the liquid (the 
‘bath’) degrees of freedom is represented by the constant γ and the random
force R(t), where the two are related by the fluctuation-dissipation
theorem (FDT). This is an instance of reduction where the composite system
made up of the Brownian particle and the liquid is reduced to the
Brownian particle alone whose dynamical variable v (the position co- 
ordinate x enters implicitly) features explicitly in the equation, in which 
the effect of the bath degrees of freedom is represented by means of a few 
variables introduced phenomenologically. This constitutes an instance of 
a general procedure where a subsystem (S) of a composite system (C) is 
reduced to the former alone, with the effect of the bath (B) represented 
in some effective way, reproducing the statistical behavior of S in
interaction with the bath. In the process, the bath variables get spirited
away and the complex problem of describing the behavior of S in
interaction with B is replaced with a simpler one. This approach of reduction
is employed in several classes of problems in various schemes of approx- 
imation, of which the Langevin equation constitutes an example of sub- 


stantial importance. 


Incidentally, depending on the physical situation at hand, the process of 
reduction may involve various degrees of approximation, and this may 
show up in the resulting reduced equation in the form of residual com- 
plexities of various levels. Indeed, one may have to resort to approxi- 
mation schemes in order to work out the consequences of the reduced 
equation itself. For instance, the equation (8-286) is obtained as a rea- 
sonable approximation describing the behavior of a Brownian particle in 
a liquid when the former is of a large mass such that the time scale 
associated with its motion is large compared to the time interval of the 


impacts of the bath molecules. This constitutes the simplest instance of
a Langevin equation which is linear and of the Markov type, the latter
qualification denoting the fact that the systematic force represented by 
the first term on the right hand side of (8-286) does not involve memory 
effects. The simplification also relates to the statistical features that the 
random force R(t) is assumed to be characterized by. For a Brownian 
particle of small mass, or for a system S of more complex description, 
one may have to deal with Langevin equations without these simplifying 
features. An effective way to work out the consequences of a large class of 
Langevin equations is by way of formulating and analyzing the associated 
Fokker-Planck equations. In all cases, however, numerical computations 
employed cleverly in association with analytical schemes provide one with 
much useful information regarding the behavior of the subsystem under 


consideration. 


8.7.1.2 The classical Langevin equation: consequences 


We will now work upon the simplest of the family of Langevin equations, 
namely, the equation (8-286) (referred to as the classical Langevin
equation), where the random force is assumed to have the following
characteristic features: (i) it has zero mean and (ii) its autocorrelation is
in the nature of a delta function, i.e.,

$$ \langle R(t) \rangle = 0, \qquad \langle R(t) R(t') \rangle = 2B\,\delta(t - t'), \tag{8-287} $$


where the constant B (the ‘strength’ of the random force) will be seen to
be related to γ by the requirement of equilibrium of the bath at some
specified temperature T. It is this relation between the systematic and
the random parts of the fluctuating force on the Brownian particle that
constitutes the statement of the FDT in the context of (8-286). In the 
above formulae, the angular brackets denote the average over possible 
realizations of the random force R(t), which is referred to as an ‘average 
over the noise’. Since the random force is assumed to represent the ef- 
fect of equilibrium fluctuations in the liquid, and the Langevin equation 
relates the random force to the time course of development of the observ- 
ables of the Brownian particle (or more generally, of the subsystem S), 
the same angular brackets applied to such observables will denote the 


corresponding equilibrium expectation values. 


As mentioned earlier, the separation of the impact force between a system- 
atic and a random part is of a phenomenological nature, subject to the 


requirement of thermodynamic equilibrium of the liquid bath. 


The random force R(t) is assumed to describe a Gaussian random pro- 
cess that is stationary in character and is completely described in terms 
of its first and second moments specified in (8-287), all the higher mo- 
ments being determined by these two. The Gaussian nature of R(t) is 
a consequence of the fact that the force in any given short time interval 
arises due to a very large number of impacts acting independently, each 
impact being of an extremely short duration. This implies that the Cen- 
tral limit theorem can be applied in determining the statistical features 
of R(t). The Langevin equation (8-286) being linear in nature, this feature 
carries over to the velocity v(t) which can also be described as a Gaussian 


random process.


The Langevin equation is an instance of a stochastic differential equation
describing the time evolution of the stochastic variable v under
the influence of a noise, where the time evolution is said to describe 


a stochastic process. 


The FDT can now be derived by referring to the solution of (8-286) that 


appears as 
$$ v(t) = e^{-\gamma t/m}\, v(0) + \frac{1}{m} \int_0^t d\tau\, e^{-\gamma (t-\tau)/m}\, R(\tau), \tag{8-288} $$


in which the external force is assumed to be absent (i.e., F = 0) and the
initial time has been set at t = 0 without loss of generality. This explicitly 
represents v(t) as a random process whose mean and correlations can 
all be worked out by making use of (8-287). In particular, squaring and 


taking the mean over the distribution of the random force, one finds 


$$ \langle v(t)^2 \rangle = e^{-2\gamma t/m}\, v(0)^2 + \frac{B}{\gamma m} \left[ 1 - e^{-2\gamma t/m} \right], \tag{8-289} $$

(check this out). Incidentally, the solution (8-288) holds for t > 0 since
causality demands that the random force at any given instant of time
cannot determine the velocity at an earlier instant. One can now go over
to the limit t → ∞ so as to determine the mean squared velocity in the
distant future when the systematic effect of the fluctuating force is
expected to die down (recall that the external force has been assumed to be
absent, as a result of which (8-288), read with the first equality in (8-287),
implies ⟨v(t)⟩ → 0 for t → ∞). One thereby obtains

$$ \langle v(t)^2 \rangle \to \frac{B}{\gamma m} \quad (t \to \infty). \tag{8-290} $$
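The check of (8-289) suggested above can be sketched as follows. The sketch assumes that v(0) is a fixed (non-random) initial velocity, so that it can be taken outside the average over the noise; the cross term then drops out because of the first equality in (8-287), and the delta function in the second equality collapses the double integral.

```latex
\begin{align*}
\langle v(t)^2 \rangle
 &= e^{-2\gamma t/m}\, v(0)^2
  + \frac{2}{m}\, e^{-\gamma t/m}\, v(0) \int_0^t d\tau\, e^{-\gamma (t-\tau)/m}\, \langle R(\tau) \rangle \\
 &\quad + \frac{1}{m^2} \int_0^t d\tau \int_0^t d\tau'\,
    e^{-\gamma (2t-\tau-\tau')/m}\, \langle R(\tau) R(\tau') \rangle \\
 &= e^{-2\gamma t/m}\, v(0)^2 + \frac{2B}{m^2} \int_0^t d\tau\, e^{-2\gamma (t-\tau)/m} \\
 &= e^{-2\gamma t/m}\, v(0)^2 + \frac{B}{\gamma m} \left[ 1 - e^{-2\gamma t/m} \right].
\end{align*}
```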


From the physical point of view, the Brownian particle has to arrive at 
an equilibrium state (regardless of its initial velocity) compatible with the 
equilibrium of the liquid bath, which implies that its mean kinetic energy 
(recall that we are considering only one degree of freedom for the sake of 
simplicity) has to tend to ½ k_B T (equipartition). In other words, one has to
have the consistency condition

$$ B = \gamma k_B T, \tag{8-291} $$


satisfied between the random and the systematic parts of the fluctuating 
force, expressing the fact that this force arises in virtue of the equilibrium 
fluctuations of the liquid bath at temperature T. This then constitutes the 
statement of the FDT in the present context of the Brownian particle as 


described by (8-286). 
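The consistency condition (8-291) can also be checked numerically. The following is a minimal sketch, not a definitive implementation: it evolves an ensemble of independent velocities with F = 0, using the exact one-step update implied by (8-288) and (8-289) (a deterministic decay plus a Gaussian kick), with B fixed by (8-291). The function name, the parameter values and the units (in which k_B T appears as the single number kT) are illustrative assumptions.

```python
import math
import random


def simulate_mean_square_velocity(gamma=2.0, m=1.0, kT=1.5, dt=0.05,
                                  n_steps=100, n_particles=4000,
                                  v0=3.0, seed=1):
    """Evolve an ensemble of velocities and return the final <v^2>.

    All particles start with the same velocity v0, far from equilibrium.
    """
    rng = random.Random(seed)
    B = gamma * kT                       # noise strength fixed by (8-291)
    decay = math.exp(-gamma * dt / m)    # deterministic factor of (8-288) over one step
    # Standard deviation of the noise integral of (8-288) over one step,
    # read off from the second term of (8-289) with t replaced by dt.
    kick = math.sqrt(B / (gamma * m) * (1.0 - decay ** 2))
    velocities = [v0] * n_particles
    for _ in range(n_steps):
        velocities = [decay * v + kick * rng.gauss(0.0, 1.0) for v in velocities]
    return sum(v * v for v in velocities) / n_particles


if __name__ == "__main__":
    v2 = simulate_mean_square_velocity()
    print("simulated <v^2>      :", v2)
    print("equipartition kT / m :", 1.5 / 1.0)
```

With the default values the run covers ten relaxation times m/γ, so the memory of v0 is lost and the ensemble average should agree with kT/m to within the statistical error of a few per cent expected for 4000 samples.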


Referring to (8-287), the formula (8-291) can be expressed in terms of the 


integrated time correlation of the random force as

$$ k_B T = \frac{1}{\gamma} \int_0^\infty dt\, \langle R(t_0) R(t_0 + t) \rangle, \tag{8-292} $$

(check this out; make use of the symmetry property of the delta function,
δ(t) = δ(−t), by which the integral from 0 to ∞ picks up only half of its weight).
This constitutes an alternative expression of the FDT. As
we will see, this can be generalized further to motions of Brownian
particles under less restrictive conditions, where one obtains a relation
between the response of the particle to time-dependent external forces 
and certain equilibrium time correlation functions. It may be mentioned 
here that the formula (8-292) remains valid even in the presence of a 
force field (refer to [83]), though our derivation here is restricted to the 
case F = 0. As we see below, (8-286) is a special instance of Langevin
equations of more general form. Among the latter, there exists a class 
of Langevin equations that can fruitfully be analyzed by formulating the 
corresponding Fokker-Planck equations. The Fokker-Planck equations 
constitute a powerful tool for the analysis of stochastic processes since 


these tell us how the relevant probability distributions evolve in time. 


1. An alternative formulation of the FDT is the Einstein formula

$$ D = \frac{k_B T}{\gamma}, \tag{8-293} $$

arrived at by considering the diffusion of the Brownian particles in
the liquid in equilibrium at temperature T. The diffusion process is
the manifestation of the migration of the Brownian particles when
observed over a long time scale, over which the irregular part of the
motion gets averaged out. If C(x) denotes the concentration of the
Brownian particles (we continue to focus on the projection of the motion on
the x-axis), then the diffusion current is j_diff = −D ∂C/∂x, where D stands
for the diffusion coefficient. Assuming that the diffusion takes place
under the action of an external force F(x) = −∂V/∂x derivable from
a potential V(x), the equilibrium distribution of the concentration is
obtained from the condition that, in the long run, the total current
j_tot = j_diff + j_drift (this is a macroscopically defined quantity) has to be
zero, where the drift current is given by
j_drift = v C(x) = −(C(x)/γ) ∂V/∂x.
Since the equilibrium concentration C(x) has to be proportional to
e^{−V(x)/(k_B T)} in accordance with the Gibbs canonical distribution, the
relation (8-293) follows.


2. In the microscopic description, the diffusion constant D is related to
the mean squared displacement of the Brownian particle as

$$ D = \lim_{t \to \infty} \frac{1}{2} \frac{d}{dt} \langle x^2 \rangle, \tag{8-294a} $$

as can be seen by making use of the diffusion equation

$$ \frac{\partial C}{\partial t} = D \frac{\partial^2 C}{\partial x^2}, \tag{8-294b} $$

and noting that ⟨x²⟩ is related to the concentration (a macroscopically
defined quantity) as

$$ \langle x^2 \rangle = \int x^2 C\, dx. \tag{8-294c} $$

One then has ∂_t⟨x²⟩ = D ∫ x² (∂²C/∂x²) dx. On integrating the right hand
side by parts twice in succession, and making use of the appropriate
boundary conditions (both C and its spatial derivative vanish at
x → ±∞) and the normalization condition ∫ C(x) dx = 1, it is seen to reduce
to 2D.

The limit t → ∞ is necessary since the description in terms of diffusion
holds only in the long run.


3. Finally, the mean squared displacement ⟨x²(t)⟩ can be related to the
integrated velocity time correlation function by noting that

$$ x(t) = \int_0^t v(t')\, dt', \tag{8-295a} $$

which gives

$$ \frac{d}{dt} \langle x^2 \rangle = 2 \int_0^t dt'\, \langle v(0) v(t') \rangle, \tag{8-295b} $$

(check this out). Going over to the long time limit, one obtains

$$ D = \int_0^\infty dt\, \langle v(t_0) v(t_0 + t) \rangle. \tag{8-295c} $$


Another alternative formulation of the FDT in the case of the Brownian 
particle in a liquid bath (resulting from (8-293) and (8-295c) in the notes 
above, where the Brownian motion is considered in the context of steady 


diffusion) is 


$$ \frac{k_B T}{\gamma} = \int_0^\infty dt\, \langle v(t_0) v(t_0 + t) \rangle, \tag{8-296} $$

where the constant γ, featuring in the Langevin equation (8-286), stands
for the resistive force per unit velocity in the liquid. The drift velocity
v = F/γ can also be expressed as v = μF, where μ stands for the mobility
of the Brownian particle, which is a measure of its response to the force, 
the latter being a time-independent one in the present context. Thus, in 
this special case of a constant external force, one obtains the following 


alternative form of the FDT, relating the coefficient of response (i.e., the 


mobility) and the velocity-velocity time correlation function: 


$$ \mu = \frac{1}{k_B T} \int_0^\infty dt\, \langle v(t_0) v(t_0 + t) \rangle. \tag{8-297} $$

The time t_0 in eq. (8-297) (as also in the preceding equations) can be
chosen arbitrarily since the stochastic processes under consideration are


all stationary ones (refer to (8-287)). 
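The time-correlation forms (8-295c) and (8-297) can be illustrated with a short numerical check. The sketch below assumes the equilibrium velocity autocorrelation of the classical Langevin equation, ⟨v(t_0)v(t_0 + t)⟩ = (k_B T/m) e^{−γt/m} for t ≥ 0; this expression is not written out in the text above, but it follows from (8-288) once the initial velocity is drawn from the equilibrium distribution. The function names and parameter values are illustrative, with k_B T entered as the single number kT.

```python
import math


def velocity_autocorrelation(t, gamma, m, kT):
    """Equilibrium <v(t0) v(t0 + t)> for the classical Langevin equation, t >= 0."""
    return (kT / m) * math.exp(-gamma * t / m)


def integrate_trapezoid(f, t_max, n):
    """Composite trapezoidal rule for the integral of f over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max))
    for k in range(1, n):
        total += f(k * h)
    return total * h


def diffusion_and_mobility(gamma=2.0, m=1.0, kT=1.5):
    """Return (D, mu) from the time-correlation formulas (8-295c) and (8-297)."""
    t_max = 40.0 * m / gamma   # many relaxation times, so the neglected tail is tiny
    integral = integrate_trapezoid(
        lambda t: velocity_autocorrelation(t, gamma, m, kT), t_max, 200_000)
    return integral, integral / kT


if __name__ == "__main__":
    D, mu = diffusion_and_mobility()
    print("D  =", D, "  Einstein value kT / gamma =", 1.5 / 2.0)
    print("mu =", mu, "  expected 1 / gamma        =", 1.0 / 2.0)
```

The two printed pairs should agree closely, which is just the statement that (8-293), (8-296) and (8-297) are different ways of writing the same relation.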


Before we proceed, let us pause to recall what the various different forms 
of the FDT tell us about a Brownian particle (or a collection of indepen- 
dently moving Brownian particles) and how we expect the statement of 
FDT to be modified as we consider less restrictive forms of the Langevin 


equation as compared to (8-286), (8-287). 


The first thing to mention is that the FDT represents an intrinsic fea- 
ture of a subsystem S in interaction with a bath in equilibrium, and is a 
manifestation of how the equilibrium fluctuations in the bath determine 
the resistive force arising from the bath fluctuations, where an important 
qualification is that the subsystem (the Brownian particle in the present 
context) itself is to be close to the equilibrium configuration compatible 
with the bath. It is this last requirement that prompts us to assume 
that the resistive force is linearly related to the velocity, though the form 
assumed in (8-286) can be generalized to include memory effects as in 


sec. 8.7.1.4 below. 


With this understanding in place, formula (8-291) may be taken to be the 
basic statement of FDT where the coefficient γ characterizing the
resistive force (assumed to involve no memory effect) is related to the strength
of correlation of the random force characterizing the equilibrium fluc- 
tuations in the liquid. The alternative formulation (8-293) involves the 
diffusion constant D, characterizing the transport process of diffusion 


resulting from the systematic part of the motion of the Brownian particle.
The FDT itself does not depend on the external force F(t), though the
solution of the Langevin equation, i.e., the time dependence of v(t) is cer- 
tainly affected by it. The basic idea once again is the same as in the more 
general linear response theory (of which the Langevin theory constitutes 
a special instance), where one expects that non-equilibrium time evolu- 
tion has to obey the constraints imposed by the equilibrium fluctuations 
in the bath. It is this feature that allowed us to derive (8-291) under 
the assumption F = 0, while (8-293) was derived under the assumption
that the drift current generated by the external force just balanced the 


diffusion current. 


Instead of a time dependent external force one can consider a general- 
ization of (8-286) where the force is x-dependent as in the case of the 
Brownian motion of a harmonic oscillator, while more complex nonlinear 
forces may also be considered. The harmonic oscillator problem
constitutes a particular instance of linear Langevin equations in more than one
variable. Assuming that the random force (the ‘noise’) is still in the
nature of a white noise (as described by (8-287)) and, consequently, there
is no memory effect in the resistive force, one can generalize (8-291) to 
a form that has several applications to systems of practical interest (re- 
fer to [150]). A non-linear external force F(x) requires special methods 
to be adopted in the solution of the Langevin equation, a particularly 
useful approach being the one of setting up the corresponding Fokker- 
Planck equation (see sec. 8.7.1.3 below). I repeat, however, that a time- 
dependent or a nonlinear force term in the Langevin equation does not 


alter the content of the fluctuation-dissipation theorem, though it affects
the time course of development of the relevant observables, one
commonly occurring consequence being an oscillatory time dependence of
these, along with a damped one. Time dependence of a complex nature 
may also arise in the case of Langevin equations with ‘memory’, where the 
random force is not of the simple type given by (8-287) and the resistive 


force depends on past states of motion of the Brownian particle. 


I repeat that the Langevin equations or the Fokker-Planck equations are
of use not only for describing the motion of Brownian particles but are 
applicable for systems of more general types, namely, ones whose time 
evolution typically involves a random and a systematic part. As men- 
tioned earlier, such a time evolution occurs for a system S (the ‘subsys- 
tem’) in interaction with a larger system B (the ‘bath’) in equilibrium. In 
the present context, there is an implied assumption that S, in its time 
course of evolution, is close to the equilibrium configuration compatible 
with that of B. This is the constraint under which the FDT holds. As- 
suming for the sake of generality that S is additionally acted upon by 
an external force, the FDT constrains the time course of development of 
mean values of observables of S by relating these to equilibrium fluctu- 
ations expressed in terms of time correlation functions. In other words 
the statistical mechanics of subsystems interacting with an equilibrium 
bath conforms to the general scheme of things in linear response theory 


outlined in sec. 8.4.


8.7.1.3 The Langevin equation and the Fokker-Planck equation


The Fokker-Planck equation describes the time evolution of the noise- 
averaged probability density of one or more random variables, given the 
dynamical equations for those variables in terms of time dependent ran- 


dom ‘forces’, the latter being in the nature of Gaussian white noise. 


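Thus, consider a set of random variables u_1, u_2, …, u_n, where the set may
be made up of just one single variable (corresponding to n = 1), with
dynamical equations of the form

$$ \frac{du_i}{dt} = g_i(u_1, u_2, \ldots, u_n) + R_i(t) \quad (i = 1, 2, \ldots, n). \tag{8-298a} $$

Here g_i (i = 1, 2, …, n) is some specified function of u_1, u_2, …, u_n, and the
R_i's are random ‘forces’ constituting Gaussian white noise as in (8-287),

$$ \langle R_i(t) \rangle = 0, \qquad \langle R_i(t) R_j(t') \rangle = 2B_{ij}\,\delta(t - t'). \tag{8-298b} $$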


The functions g_i (i = 1, 2, …, n) may depend non-linearly on u_1, u_2, …, u_n.
This feature is a generalization over the linear Langevin equation (8-286), 
and makes the Fokker-Planck equation a useful tool for describing a 


Brownian particle under a non-linear force field (see below). 


In the following we denote the variables u_1, u_2, …, u_n collectively by the
vector u, while similarly g(u) will denote the set of functions g_1, g_2, …, g_n
and R(t) the set of random forces R_1, R_2, …, R_n. The distribution
function f(u,t) is defined such that f(u,t) d^n u gives the probability of the
variables u_i to lie within u_i to u_i + du_i at time t (i = 1, 2, …, n; the initial
condition on the variables is left unspecified) for any particular
realization of the random forces R_1, R_2, …, R_n. Since the total probability satisfies
∫ d^n u f(u,t) = 1 at all times, f(u,t) has to satisfy an equation of continuity
in u-space, i.e., the space of the variables u_1, u_2, …, u_n:

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{n} \frac{\partial}{\partial u_i} \left( \dot{u}_i f \right) = 0, \tag{8-299a} $$

where one has to substitute for du_i/dt the right hand side of (8-298a) for the
specified realization of the random forces under consideration. This gives,
for the specified realization of R(t),

$$ \frac{\partial f}{\partial t} = -\frac{\partial}{\partial \mathbf{u}} \cdot \left[ \mathbf{g}(\mathbf{u}) f + \mathbf{R}(t) f \right]. \tag{8-299b} $$
Since this equation is written for some specified realization of R(t), it is 
a deterministic partial differential equation, analogous to the Liouville 
equation for a classical dynamical system that describes the evolution 
of the distribution function in the phase space, starting from any given 
initial distribution. Recall that a distribution function describes a mixed 
state of the dynamical system where the latter evolves in time according 
to the Hamiltonian equation of motion for the system. In the present
context, u_1, u_2, …, u_n are analogous to the phase space variables and
equations (8-298a) to the Hamiltonian equations of motion.


We now take into account the fact that the forces R(t) actually constitute a
Gaussian stochastic process satisfying (8-298b) and look at the
distribution function averaged over the probability distribution of R(t) (termed
the ‘noise averaged’ distribution function)

$$ P(\mathbf{u}, t) = \langle f(\mathbf{u}, t) \rangle_R. \tag{8-300} $$


In the following, the sub-index R will be left implied while indicating an
average over the noise, and we will refer to P(u,t) as the ‘probability
distribution’ or the ‘distribution function’ without explicitly mentioning that it


is the noise averaged function that is being referred to. 


Arriving at the noise-averaged version of (8-299b) is non-trivial, especially 
because of the fact that f(u,t) depends on R(t’) at earlier times t’ < t. I 
skip the details (refer to [150]; as mentioned above, the derivation makes 
use of the fact that R(t) represents Gaussian white noise) and state the 


final result - the Fokker-Planck equation resulting from (8-298a), (8-298b): 


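$$ \frac{\partial}{\partial t} P(\mathbf{u}, t) = -\frac{\partial}{\partial \mathbf{u}} \cdot \left( \mathbf{g}(\mathbf{u})\, P(\mathbf{u}, t) \right) + \frac{\partial}{\partial \mathbf{u}} \cdot \mathbf{B} \cdot \frac{\partial}{\partial \mathbf{u}} P(\mathbf{u}, t). \tag{8-301} $$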


The first term on the right is obtained directly from the corresponding 
term in (8-299b) by averaging over noise. The second term, on the other 
hand, is obtained by integration over the history of noise and making use 
of (8-298b). It may be noted that the Fokker-Planck equation (8-301) 
has been written without reference to a fluctuation-dissipation theorem 
since the context in which it has been arrived at is too general for that 
(recall that the physical nature of the variables u_1, u_2, …, u_n has not been
specified; however, the FDT holds in the case of the harmonic oscillator 


outlined below). 
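As a simple illustration of how (8-301) is used, one may take the case of a single variable, n = 1, and identify it with the velocity of the free Brownian particle of (8-286) with F = 0. This identification is an assumption made here for the sake of the example: u = v, g(v) = −γv/m, and the noise R(t)/m, whose strength is B/m² by (8-287). The steady-state line additionally assumes that the probability current vanishes at v → ±∞.

```latex
% Single variable u = v, g(v) = -\gamma v/m, noise R(t)/m of strength B/m^2 (F = 0 assumed)
\begin{align*}
\frac{\partial P}{\partial t}
  &= \frac{\gamma}{m} \frac{\partial}{\partial v} (vP)
   + \frac{B}{m^2} \frac{\partial^2 P}{\partial v^2}, \\
0 &= \frac{\gamma}{m}\, v P_{\rm st} + \frac{B}{m^2} \frac{\partial P_{\rm st}}{\partial v}
  \quad \Longrightarrow \quad
  P_{\rm st}(v) \propto \exp\!\left( -\frac{\gamma m v^2}{2B} \right).
\end{align*}
```

The stationary distribution is of the Maxwell form exp(−mv²/(2k_B T)) precisely when B = γk_B T, which is the relation (8-291) obtained earlier for the free Brownian particle.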
In the particular instance of the Brownian motion of the harmonic
oscillator, one starts with the Langevin equation
$$ m\dot{v} = -\gamma v - m\omega^2 x + R(t), \tag{8-302a} $$


where one has an x-dependent external force (instead of the time-dependent
force F(t) in (8-286)), while the random force R(t) is assumed to
satisfy (8-287) and the Langevin equation is still a linear one. One can
write this in a form analogous to (8-298a) by introducing the vector u
with u_1 = x, u_2 = v and the matrix

$$ A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & -\gamma/m \end{pmatrix}. \tag{8-302b} $$

Equation (8-302a) then appears in the form

$$ \dot{\mathbf{u}} = A\mathbf{u} + \mathbf{R}(t), \tag{8-302c} $$

where the components of the vector R(t) are R_1(t) = 0, R_2(t) = R(t)/m, and
the second equality in (8-287) can be expressed in the form

$$ \langle \mathbf{R}(t) \mathbf{R}(t') \rangle = \frac{2B}{m^2} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \delta(t - t'). \tag{8-302d} $$
We will first check if the FDT can be arrived at as in the case of (8-286). 


Recall that the FDT could not be formulated for (8-298a) since it
represented too general a situation, where the random variables u_i (i =
1, 2, …) were introduced formally, and not from the physical point of


view. 


For any specified realization of the random process R(t), the solution
to (8-302c) is

$$ \mathbf{u}(t) = e^{tA}\, \mathbf{u}(0) + \int_0^t dt'\, e^{(t-t')A}\, \mathbf{R}(t'), \tag{8-302e} $$

where e^{tA} is obtained by diagonalizing the matrix A, whose eigenvalues
are λ, λ*, with

$$ \lambda = -\frac{\gamma}{2m} + i \sqrt{\omega^2 - \frac{\gamma^2}{4m^2}}. \tag{8-302f} $$

Making use of these results one obtains, from (8-302c), the solution for v(t)
in the form

$$ v(t) = \tilde{v}(t) + \frac{1}{m} \int_0^t dt'\, q(t - t')\, R(t'), \tag{8-303a} $$

where ṽ(t) stands for the transient part of v(t) that goes to zero for t → ∞
(one can work out the expression for ṽ(t) in terms of x(0), v(0) as for the
damped harmonic oscillator, but this will not be needed for our present
purpose) and


$$ q(t) = \left( e^{tA} \right)_{22}. \tag{8-303b} $$

On squaring and averaging over noise, one obtains, for t → ∞,

$$ \langle v(\infty)^2 \rangle = \lim_{t \to \infty} \frac{1}{m^2} \int_0^t dt' \int_0^t dt''\, q(t - t')\, q(t - t'')\, \langle R(t') R(t'') \rangle. \tag{8-304} $$

On invoking the second relation in (8-287) (or, equivalently, (8-302d)),
making use of the delta function in performing the integration over t'',
and then making a change of the integration variable, we obtain

$$ \langle v(\infty)^2 \rangle = \frac{2B}{m^2} \int_0^\infty q(\tau)^2\, d\tau, \tag{8-305a} $$

where q(τ) (refer to (8-303b)) is given by

$$ q(\tau) = \frac{1}{\lambda - \lambda^*} \left( \lambda e^{\lambda \tau} - \lambda^* e^{\lambda^* \tau} \right), \tag{8-305b} $$

(check this out). The integration in (8-305a) can now be carried out by
making use of (8-302f). On imposing the constraint that at t → ∞ the
harmonic oscillator has to come to an equilibrium configuration that is
compatible with that of the bath at temperature T, one obtains

$$ \langle v(\infty)^2 \rangle = \frac{B}{\gamma m} = \frac{k_B T}{m}, \tag{8-305c} $$

i.e.,

$$ B = \gamma k_B T. \tag{8-305d} $$


This is the same relation as obtained for the Brownian particle in the 
absence of an external force, as it should be since, as mentioned earlier, 


the FDT is an intrinsic feature of a subsystem in interaction with a bath
at equilibrium, provided that the subsystem itself is in the linear regime,


close to equilibrium. 
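The integration referred to above can be checked numerically. The following sketch evaluates the integral of q(τ)² in (8-305a) by the trapezoidal rule, with q built from the eigenvalue λ of (8-302f) as in (8-305b) and B set equal to γk_B T. It is restricted to the underdamped case ω > γ/(2m), in which λ and λ* are distinct complex conjugates; the function names and parameter values are illustrative assumptions, with k_B T entered as the single number kT.

```python
import cmath
import math


def q(tau, lam):
    """q(tau) of (8-305b) for a complex eigenvalue lam and its conjugate."""
    lam_c = lam.conjugate()
    value = (lam * cmath.exp(lam * tau) - lam_c * cmath.exp(lam_c * tau)) / (lam - lam_c)
    return value.real   # the imaginary part vanishes identically


def mean_square_velocity(gamma=0.5, m=1.0, omega=2.0, kT=1.5, n=100_000):
    """Long-time <v^2> of the damped oscillator from (8-305a), with B = gamma * kT."""
    if omega <= gamma / (2.0 * m):
        raise ValueError("sketch covers only the underdamped case omega > gamma/(2m)")
    # Eigenvalue of the matrix A, eq. (8-302f)
    lam = complex(-gamma / (2.0 * m),
                  math.sqrt(omega ** 2 - gamma ** 2 / (4.0 * m ** 2)))
    B = gamma * kT
    t_max = 40.0 * m / gamma    # q^2 decays as exp(-gamma * tau / m)
    h = t_max / n
    total = 0.5 * (q(0.0, lam) ** 2 + q(t_max, lam) ** 2)
    for k in range(1, n):
        total += q(k * h, lam) ** 2
    return 2.0 * B / m ** 2 * total * h


if __name__ == "__main__":
    print("<v^2> from (8-305a):", mean_square_velocity())
    print("equipartition kT/m :", 1.5 / 1.0)
```

The result does not depend on ω, which is the numerical counterpart of the statement that the harmonic force does not alter the content of the FDT.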


Referring to the general form of the Fokker-Planck equation (8-301) and 
invoking the FDT, one can now write down the Fokker-Planck equation 
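for the noise-averaged distribution function P(x,v,t) of the damped
harmonic oscillator as

$$ \frac{\partial P}{\partial t} = -\frac{\partial}{\partial x} (vP) + \frac{\partial}{\partial v} \left[ \left( \frac{\gamma v}{m} + \omega^2 x \right) P \right] + \frac{\gamma k_B T}{m^2} \frac{\partial^2 P}{\partial v^2}, \tag{8-306} $$

(check this out). This can be generalized to the case of a Brownian particle
in a force field derived from a potential V(x) by replacing ω²x in the second
term on the right hand side with (1/m) ∂V/∂x:

$$ \frac{\partial P}{\partial t} = -\frac{\partial}{\partial x} (vP) + \frac{\partial}{\partial v} \left[ \left( \frac{\gamma v}{m} + \frac{1}{m} \frac{\partial V}{\partial x} \right) P \right] + \frac{\gamma k_B T}{m^2} \frac{\partial^2 P}{\partial v^2}. \tag{8-307} $$

In the absence of dissipation (γ = 0), the Fokker-Planck equation (8-307)
(with the velocity v replaced with p/m) reduces to the Liouville equation
describing the evolution of the probability density in the phase space for
a Hamiltonian

$$ H = \frac{p^2}{2m} + V(x), \tag{8-308} $$

as it should (reason out why), and the steady state equilibrium solution
P_eq(x,p), satisfying ∂P_eq/∂t = 0, reduces to the canonical distribution in the
phase space:

$$ P_{\rm eq}(x, p) = \frac{1}{Z}\, e^{-H/(k_B T)}, \qquad Z = \int dx\, dp\, e^{-H/(k_B T)}, \tag{8-309} $$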


(check this out). In this case, the Fokker-Planck equation tells us that 
an initial distribution in the phase space evolves in time so as to tend to 
P_eq in the limit t → ∞. In other words, the Fokker-Planck equation in
this case is a way to describe the ‘coarse-graining’ necessary to generate 
the canonical distribution from the long-time behavior of the solution 
to the Liouville equation, starting from any given initial distribution in 
the phase space (barring a class of exceptional initial distributions). In 
general, however, a Fokker-Planck equation of the form (8-301) does not 
allow us to make specific statements about the existence and nature of 


steady state solutions in the limit t → ∞.
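For the Brownian particle in a potential, the stationarity of the canonical distribution under (8-307) can be verified by direct substitution. The sketch below writes the distribution in terms of the velocity, P_eq(x,v) ∝ exp[−(mv²/2 + V(x))/(k_B T)], which is (8-309) with p = mv; this change of variable is the only assumption made.

```latex
\begin{align*}
\frac{\partial P_{\rm eq}}{\partial x}
  &= -\frac{1}{k_B T} \frac{\partial V}{\partial x}\, P_{\rm eq},
\qquad
\frac{\partial P_{\rm eq}}{\partial v} = -\frac{m v}{k_B T}\, P_{\rm eq}, \\
\frac{\gamma v}{m}\, P_{\rm eq} + \frac{\gamma k_B T}{m^2} \frac{\partial P_{\rm eq}}{\partial v}
  &= 0 \quad \text{(friction and noise terms cancel)}, \\
-\frac{\partial}{\partial x} (v P_{\rm eq})
 + \frac{\partial}{\partial v} \left( \frac{1}{m} \frac{\partial V}{\partial x}\, P_{\rm eq} \right)
  &= \frac{v}{k_B T} \frac{\partial V}{\partial x}\, P_{\rm eq}
   - \frac{v}{k_B T} \frac{\partial V}{\partial x}\, P_{\rm eq} = 0 .
\end{align*}
```

The second line holds only because the coefficient of the second-derivative term is γk_B T/m², i.e., because the FDT has been used; the third line is the Liouville part, which vanishes for any function of the energy.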


8.7.1.4 The generalized Langevin equation 


The formula (8-289), which follows from (8-288), was derived on the
assumption that the random force constitutes a ‘white noise’ (refer to (8-287)).
This is clearly a restrictive assumption whose validity requires that the 
time scale characterized by the molecular impacts be much smaller than 
that characterizing the motion of the Brownian particle. If, however, the 
latter is of a small mass so that its motion is sufficiently rapid, one will 
have to consider the detailed short-time correlations of the random force. 
The necessity also arises of assuming that the systematic part of the
fluctuating force involves a ‘memory’ where the force at time t depends on
velocities at times prior to t. One is thereby led to consider a
time-dependent friction γ(t) such that the Langevin equation now appears in the
form

$$ m\dot{v}(t) = -\int_{-\infty}^{t} d\tau\, \gamma(t - \tau)\, v(\tau) + F(t) + R(t). \tag{8-310a} $$


The first term on the right hand side gives the resistive force at time t 
which now involves a summation over elementary contributions coming 
from earlier times τ spread out from τ → −∞ to τ = t. The second term
represents the external force as in (8-286), while the third term represents


the random force that is still required to satisfy 
$$ \langle R(t) \rangle = 0, \tag{8-310b} $$


(indeed, the mean of the fluctuating force represents the systematic resis- 
tive force and is included as such), though the second equality of (8-287) 
now needs to be modified, where we now make the minimal assumption 


that 
$$ \langle v(t_0) R(t) \rangle = 0 \quad (t_0 < t). \tag{8-310c} $$


This states that the random force at time t is not correlated with the 
velocity at earlier times. In addition, we assume that the random force is 
not correlated with the external force F, since the latter is assumed to be


a weak one so that only linear terms need to be taken into account. 


A stochastic process such as v(t) described by a stochastic differential 


equation involving memory effects is referred to as a ‘non-Markovian’ one,
in contrast to the one resulting from (8-286), which is of the Markovian


type. 


In describing and analyzing stochastic processes with memory effects, 
it is often convenient to work with representations in the frequency do- 
main as in sec. 8.4.7. For a stationary stochastic process (i.e., one for 
which the origin of time can be chosen arbitrarily; equilibrium fluctua- 
tions are instances of stationary stochastic processes) u(t), defining the 


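power spectrum G_u(ω) as

$$ G_u(\omega) = \int_{-\infty}^{\infty} \langle u(t_0) u(t_0 + t) \rangle\, e^{i\omega t}\, dt, \tag{8-311a} $$

where t_0 is arbitrary, one can express the correlation between the Fourier
components u(ω) in terms of the power spectrum as

$$ \langle u(\omega) u(\omega') \rangle = 2\pi\, G_u(\omega)\, \delta(\omega + \omega'), \tag{8-311b} $$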
(check this out). 


Considering, for instance, the stochastic process represented by the 
random force R(t) satisfying (8-287) (note that this is an instance of a 
Markovian process, in contrast to the more general types of processes 
under consideration in the present section), the FDT, expressed by 
the relation (8-291) can now be expressed in the frequency domain 


as 


$$ \langle R(\omega) R(\omega') \rangle = 4\pi \gamma k_B T\, \delta(\omega + \omega'), \tag{8-312a} $$

since the power spectrum G_R(ω) is a constant

$$ G_R(\omega) = 2B = 2\gamma k_B T, \tag{8-312b} $$


(check this out). 


In order to describe the solution of (8-310a) in the frequency domain, it 
is convenient to consider an external force with a harmonic time depen- 


dence 


$$ F(t) = F_0\, e^{-i\omega t}, \tag{8-313} $$


where it is understood that the real parts of complex expressions are to 
be taken so as to represent real-valued physical quantities in the the- 
ory. In virtue of the linearity of (8-310a), the response of the Brownian 
particle to a force with an arbitrarily defined time dependence (subject to
appropriate boundary conditions) can be represented as a superposition 
of responses to harmonic forces of type (8-313). Accordingly, denoting the 
mobility in the frequency domain representation by ju(w), the response to 


the harmonic force introduced above can be expressed in the form 


(u(t)) = p(w) Foe ™. (8-314) 


Note that the response function $\mu(\omega)$ is to be a noise-averaged quantity. In the thermodynamic interpretation, the averaging over noise stands for an average over equilibrium fluctuations. In other words, the response function is a macroscopically defined one.
On substitution in (8-310a), one obtains the mobility $\mu(\omega)$, i.e., the response function in the frequency domain, as

$$\mu(\omega) = \frac{1}{-i\omega m + \gamma[\omega]}, \qquad \text{(8-315a)}$$

where $\gamma[\omega]$ stands for the Fourier-Laplace transform [83] of $\gamma(t)$ defined as

$$\gamma[\omega] = \int_0^{\infty} e^{i\omega t}\gamma(t)\,dt, \qquad \text{(8-315b)}$$

(check this out; after substituting in (8-310a) and making use of (8-313), you will need to redefine the variable of integration). One can express $\mu(\omega)$ in terms of the velocity correlation function by noting that the latter is an intrinsic quantity, defined by averaging over the noise (which stands for an average over the equilibrium fluctuations of the bath), and can hence be related to $\gamma[\omega]$ by referring to the special case $F_0 = 0$ for the sake of simplicity. This yields

$$\int_0^{\infty} dt\, e^{i\omega t}\langle v(t_0)v(t_0+t)\rangle = \frac{k_B T}{-i\omega m + \gamma[\omega]}, \qquad \text{(8-316)}$$

where $t_0$ can be specified arbitrarily.


Check the above formula out. Starting from (8-310a) with F = 0, obtain the expression for $\int_0^{\infty} e^{i\omega t}\,\frac{d}{dt}\langle v(t_0)v(t_0+t)\rangle\,dt$ in terms of an integral involving the function $\gamma(t)$ by making use of (8-310c); in the resulting equation, perform an integration by parts on the left hand side, and then rearrange terms; you will have to make use of the fact that, because of the damping, an expression of the form $e^{i\omega t}v(t)$ goes to zero as $t \to \infty$.


On making use of (8-315a), and of the fact that the equilibrium-averaged value of $v^2$ is $k_B T/m$, one finally obtains

$$\mu(\omega) = \beta\int_0^{\infty} dt\, e^{i\omega t}\langle v(t_0)v(t_0+t)\rangle, \qquad \text{(8-317)}$$

where $t_0$ can be chosen arbitrarily.
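The relation between the mobility and the velocity correlation function can be checked numerically in a simple case. The sketch below assumes that the generalized Langevin equation has the form m dv/dt = -∫ γ(t-s) v(s) ds + F + R, and it picks an exponential memory kernel γ(t) = (Γ/τ_c) exp(-t/τ_c) purely for illustration; the function names and parameter values are hypothetical and not taken from the text. For this kernel the equation for the correlation function C(t) = ⟨v(t_0)v(t_0+t)⟩ closes on introducing the auxiliary variable z(t) = ∫_0^t γ(t-s) C(s) ds.

```python
import numpy as np

def kernel_transform(omega, Gamma, tau_c):
    """gamma[omega] for the assumed kernel gamma(t) = (Gamma/tau_c) exp(-t/tau_c)."""
    return Gamma / (1.0 - 1j * omega * tau_c)

def velocity_correlation(m, kT, Gamma, tau_c, dt, n_steps):
    """RK4 solution of m dC/dt = -z, dz/dt = (Gamma*C - z)/tau_c,
    with z(t) = int_0^t gamma(t-s) C(s) ds, C(0) = kT/m and z(0) = 0."""
    A = np.array([[0.0, -1.0 / m], [Gamma / tau_c, -1.0 / tau_c]])
    y = np.array([kT / m, 0.0])
    C = np.empty(n_steps + 1)
    C[0] = y[0]
    for n in range(n_steps):
        k1 = A @ y
        k2 = A @ (y + 0.5 * dt * k1)
        k3 = A @ (y + 0.5 * dt * k2)
        k4 = A @ (y + dt * k3)
        y = y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        C[n + 1] = y[0]
    return C

def mobility_from_correlation(omega, C, dt, kT):
    """beta * int_0^inf e^{i omega t} C(t) dt, evaluated with the trapezoidal rule."""
    t = dt * np.arange(C.size)
    f = np.exp(1j * omega * t) * C
    return dt * (f.sum() - 0.5 * (f[0] + f[-1])) / kT

def mobility_direct(omega, m, Gamma, tau_c):
    """1 / (-i omega m + gamma[omega]) for the assumed kernel."""
    return 1.0 / (-1j * omega * m + kernel_transform(omega, Gamma, tau_c))

if __name__ == "__main__":
    m, kT, Gamma, tau_c, dt = 1.0, 1.0, 1.0, 0.5, 0.01
    C = velocity_correlation(m, kT, Gamma, tau_c, dt, 4000)
    for omega in (0.0, 0.7, 1.3):
        print(omega,
              mobility_from_correlation(omega, C, dt, kT),
              mobility_direct(omega, m, Gamma, tau_c))
```

The two printed columns should agree to within the discretization error of the time integration, which is one way of seeing that the correlation-function expression and the direct expression for the mobility carry the same information.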


We can now compare this with the general results on FDT arrived at in sec. 8.4, where we looked at the response, measured by the change in the non-equilibrium mean value of an observable B, caused by an external perturbation coupled to the system under consideration through an observable A. The causal response function $R^{c}_{BA}(t)$ is given by (see (8-130b))

$$R^{c}_{BA}(t) = \beta\langle \dot A(t_0)B(t_0+t)\rangle \quad (t > 0), \qquad \text{(8-318)}$$

in which $\dot A(t_0)$ stands for the time derivative of A in the Heisenberg type description, evaluated at $t_0$, where $t_0$ can be chosen arbitrarily since the equilibrium fluctuations are invariant against time translation. The dynamic susceptibility $\chi_{BA}(\omega)$, as obtained from (8-157), (8-188), and from the fact that $R^{c}_{BA}(t)$ is defined only for t > 0, then appears as

$$\chi_{BA}(\omega) = \beta\int_0^{\infty} dt\, e^{i\omega t}\langle \dot A(t_0)B(t_0+t)\rangle. \qquad \text{(8-319)}$$


It remains to identify, for the particular case of the Brownian particle, $\chi_{BA}(\omega)$ with the mobility $\mu(\omega)$ while, at the same time, identifying B with the velocity v and A with the position co-ordinate x, since the effect of the force F is the same as that of a term $-xF$ in the Hamiltonian of the particle (the resistive force and the random force are assumed to result from the equilibrium fluctuations of the bath). Noting that $\dot x = v$, one observes that (8-317) constitutes a special instance of (8-319). In other words, the generalized Langevin equation (8-310a) conforms to the general expression of the FDT for a Brownian particle consistent with the formulation in sec. 8.4 for the linear response of a system close to equilibrium.


An alternative expression for the FDT in the context of the generalized Langevin equation is given by (see [83])

$$\gamma[\omega] = \frac{1}{k_B T}\int_0^{\infty} \langle R(0)R(t)\rangle\, e^{i\omega t}\,dt. \qquad \text{(8-320)}$$


The Langevin equation can also be set up for a quantum mechanical Brownian particle, and the above results can all be generalized to the quantum context by replacing the classical correlations with the corresponding Kubo correlation functions, as explained in sec. 8.4.7.4 (refer to [83]).


8.7.2 The Langevin equation: reduction from an oscillator bath

8.7.2.1 The idea of reduction 


I will once again briefly outline the basic idea underlying the approach adopted in non-equilibrium statistical mechanics, commonly referred to as reduction or ‘contraction’. This approach, which is really ubiquitous in statistical mechanics, works in either of two broadly distinct but related areas of inquiry or, put differently, two overlapping scenarios.


First, one has a situation analogous to the Brownian particle in a liq- 
uid bath where, in describing the motion of the particle, one blots out 
the bath variables and replaces those effectively with a dissipative force 
and a random one, the two carrying complementary information about 
how the bath affects the particle. The bath is assumed to be a large 
system that continues in a state of equilibrium at temperature T. This 
involves an implied assumption that the time scale for the relaxation of 
the bath variables is small compared to the time scale characterizing the 
fluctuation-averaged motion of the particle, in which case the resulting 
Langevin equation is of the Markovian type. This assumption about time 
scales can be relaxed, when one ends up with a generalized Langevin 
equation of the form (8-310a) where the reduction is less complete since the power spectrum of the random force R(t) and the memory kernel $\gamma(\tau)$ remain unspecified, though the Fluctuation-dissipation theorem continues to be a constant fixture in the theory.


In sections 8.7.2.2 and 8.7.2.3 below we look at the problem of how a 
particle (the ‘system’ S) interacting with a system of oscillators consti- 
tuting an idealized bath (B) can, by a process of reduction, be shown to 
be described by a Langevin equation, where the emphasis is on looking 
for a model with an exact solution. This is a classic problem in non-equilibrium statistical mechanics that not only serves as a paradigm for other, more complex, problems but is relevant in a large class of real-life problems as well. It has been investigated by means of a wide array of methods, each of considerable merit in itself as one opening the door to the investigation of a more or less broad class of problems of practical interest.


As mentioned earlier, the Langevin approach can be generalized, within 
limits, to a system S in interaction with a larger system B, referred to as 
the bath, where the bath is in a state of equilibrium and the system is 
close to being in equilibrium with the bath. In the context of the composite system made up of B and S, the latter is referred to as a subsystem, but it is also a ‘system’ in its own right, under the influence of another, so it can be addressed either way so long as there is no confusion.


Other than a Langevin type description, another widely used approach of 
reduction is by way of invoking a master equation. I will give a brief sketch of what this involves in sec. 8.7.3 below.


The second category of problems in the class that can broadly be identified as making use of a reduced description includes those that are addressed by what is referred to as the projection operator formalism or the Nakajima-Mori-Zwanzig theory. Here one does not have a separation between a bath B and a system S, but a single system S in which one makes a distinction between ‘relevant’ and ‘irrelevant’ observables, and then blots out explicit reference to the latter by a process of projection. In order to be effective, the class of the relevant variables is to be small and slow (in some appropriate sense) as compared to that of the irrelevant ones which are necessarily numerous and comparatively fast. Moreover, the system has to be close to an equilibrium state for the method to work well. We will have a brief look at this method in sec. 8.7.4 below.


I close this section with a few more general comments on the approach 
of reduction or contraction, so widely followed in statistical mechanics, 
where one settles for a partial description of a system, blotting out cer- 
tain inessential or irrelevant details — irrelevant, that is, in some specific 
context where the means of observation and the level of description of 
the system are insensitive to those details. The details that are blot- 
ted out may refer to a ‘bath’ external to the system under consideration 
or, without reference to external systems, may belong to the system itself, depending on the time scales of variation of the various observables characterizing its state.


As regards the mode of description of the system, one may refer to the 
evolution of state variables or observables characterizing it, or to proba- 
bility distributions over these observables. This is the distinction in terms 
of pure states and mixed states we are familiar with, and does not refer 
directly to the degree of completeness of the description. Thus, considering a closed system, the Hamilton equations describe the evolution of pure states, while the Liouville equation gives the same description in terms of mixed states, and neither of these involves a contraction.


When a contracted description is attempted, the effect of blotting out 
the irrelevant variables is felt upon the evolution of the remaining variables in two major ways: first, one needs a probabilistic description for these remaining variables, and secondly, the description becomes non-Markovian, involving memory effects. Referring once again to a single particle interacting with a bath (an instance of an open system) as in the case of Brownian motion, both of these effects are seen to arise in virtue of blotting out the bath variables: when the evolution of a pure state is considered, one has to introduce a random ‘force’ term in the equation of motion, and the resulting equation (the generalized Langevin equation, refer to sec. 8.7.1.4, and to sections 8.7.2.2, 8.7.2.3 below) involves, in general, a memory term. In the special situation in which the memory is retained over only an infinitesimally small time scale, the Fokker-Planck equation of sec. 8.7.1.3 provides an alternative description, now in terms of the evolution of mixed states.


Generally speaking, the memory of past states of the retained system re- 
mains effective over a time scale that depends on the nature of the vari- 
ables blotted out. The greater the extent of contraction, the longer is the 
time over which the memory of past states is retained. However, all these 
generalities acquire meaning only in the context of concrete methods of 
contraction adopted for concrete models. The reduction from an oscillator bath, discussed in sections 8.7.2.2 and 8.7.2.3 below, constitutes an instance.


The basic ideas underlying the approach of reduction or contraction, 
more generally referred to as that of ‘projection’, are explained in [84], sec- 
tion 2.5, which contains the general mathematical framework in which a 
wide variety of methods and models can be included as special instances. 
One finds that the distinction between the first and the second category of problems mentioned at the beginning of this section, addressed under the general heading of reduction, is to some extent arbitrary, since both can be looked at from the broader point of view based on the projection operator formalism.


A large class of problems in statistical mechanics deals with systems interacting with external fields rather than baths, the ones considered in the
linear response theory (sec. 8.4) being of this type. As seen there, the time 
evolution of the system under consideration essentially depends on the 
relevant equilibrium time correlation functions. In the case of a quan- 
tum mechanical system one requires the application of Fermi’s golden 
rule — a widely used result based on second order perturbation theory. 
A commonly adopted approach (see sec. 8.7.3) is to make use of a master 
equation that gives the transition rates between unperturbed stationary 
states (i.e. those in the absence of the external field). This can be gener- 
alized to include the case of a system interacting with a heat bath rather 
than a field. The basic object of relevance in the description of quantum 
mechanical evolution is the density matrix of the system. The evolution of 
the density matrix can be described at various levels of complexity. Thus, 
one can look at the rates of change of the diagonal elements (i.e., the 
probabilities of the unperturbed states) under the action of a field or of a 
bath, without attending to the changes occurring in the off-diagonal ele- 
ments, where once again the second order perturbation theory is invoked 
and results corresponding to those obtained in sec. 8.7.2.3 are arrived 
at. As mentioned earlier, analogous results are obtained by the projection operator method as well. The full dynamics of the density matrix, including the changes in the off-diagonal elements, can also be analyzed by the method of reduction from the composite density matrix including the states of the bath to which the system is coupled, from which one obtains a description of the process of decoherence in addition to that of the process of relaxation to the state of equilibrium (see sec. 8.7.3 for further details).


8.7.2.2 Reduction from oscillator bath: classical 


We start with the Hamiltonian for a single particle (the ‘system’) with momentum p, position co-ordinate x, and mass m, moving in a potential V(x), written as

$$H_S = \frac{p^2}{2m} + V(x). \qquad \text{(8-321a)}$$

The particle is assumed to interact with a ‘bath’ made up of a large number (N) of independent harmonic oscillators described by the Hamiltonian

$$H_B = \sum_{i=1}^{N}\left(\frac{p_i^2}{2m_i} + \frac{1}{2}m_i\omega_i^2 q_i^2\right). \qquad \text{(8-321b)}$$

Here $m_i, \omega_i$ stand for the mass and frequency of the ith oscillator (i = 1, 2, ..., N), characterized by the position variable $q_i$ and momentum variable $p_i$. Finally, the interaction Hamiltonian is written as

$$H_I = \sum_{i=1}^{N} g_i q_i x + \left[\sum_{i=1}^{N}\frac{g_i^2}{2m_i\omega_i^2}\right]x^2, \qquad \text{(8-321c)}$$

where the first term on the right represents a bilinear coupling of the particle with each of the bath oscillators, the coupling constant with the ith oscillator (i = 1, 2, ..., N) being $g_i$. The second term on the right does not represent a genuine interaction and can be included into $H_S$ since it does not involve the bath variables, while depending on the constants $g_i, m_i, \omega_i$ (i = 1, 2, ..., N). As a consequence of the inclusion of this term, the global minimum of the total Hamiltonian

$$H = H_S + H_B + H_I, \qquad \text{(8-321d)}$$

is found to be at x = 0, p = 0, $q_i = 0$, $p_i = 0$ (i = 1, 2, ..., N).


1. One can, in principle, consider a number of particles making up our 
system of interest, interacting with the oscillator bath. The latter can 
be made up of a large number of particles interacting with one an- 
other with a two-body force of some chosen form. However, the more 
the generalization, the less the resulting model will lend itself to an 
exact solution, unless the generalization happens to be of a trivial na- 
ture. Exact solutions are important in that these tell us in clear terms 
whether the simplifying assumptions going into the construction of the models do produce the type of behavior that one observes in real life.


2. As we will see, in order to arrive at results that represent the 
principal features of observed Brownian motion, i.e., the ones 
obtained from the Langevin equation (8-310a) (or even from the 
simpler (8-286)), one has to make certain assumptions regarding 
the oscillator bath in addition to the ones going into (8-321b), (8-321c). It does not do to simply assume that the number (N) of oscillators in the bath be large: one additionally needs to assume that the number goes to infinity, with some appropriate distribution of the frequencies of the oscillators.


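With the Hamiltonian (8-321d) in place, one can set up the equations of motion of the composite system made up of the particle and the bath, which appear as

$$\text{[particle:]}\quad \dot x = \frac{p}{m}, \qquad \dot p = -V'(x) - \sum_{i=1}^{N} g_i q_i - \sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\,x, \qquad \text{(8-322a)}$$

$$\text{[bath:]}\quad \dot q_i = \frac{p_i}{m_i}, \qquad \dot p_i = -m_i\omega_i^2 q_i - g_i x \quad (i = 1, 2, \ldots, N),$$

or, in second-order form,

$$\text{[bath:]}\quad \ddot q_i = -\omega_i^2 q_i - \frac{g_i}{m_i}\,x \quad (i = 1, 2, \ldots, N), \qquad \text{(8-323a)}$$

and

$$\text{[particle:]}\quad \ddot x = -\frac{1}{m}V'(x) - \frac{1}{m}\sum_{i=1}^{N} g_i q_i - \frac{1}{m}\sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\,x. \qquad \text{(8-323b)}$$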

These equations can now be dealt with in turn. Observe that, given an assumed solution x = x(t) of (8-323b) (without regard to how this solution is arrived at; that will be our concern at the next stage of the analysis), (8-323a) are a set of inhomogeneous linear differential equations, of which the formal solutions (formal, because we have only an assumed solution x(t) at this stage), subject to given initial conditions $q_i = q_i(0)$, $p_i = p_i(0)$ (i = 1, 2, ..., N), can be written as

$$q_i(t) = q_i(0)\cos\omega_i t + \frac{p_i(0)}{m_i\omega_i}\sin\omega_i t - \frac{g_i}{m_i\omega_i}\int_0^t d\tau\,\sin\omega_i(t-\tau)\,x(\tau), \qquad \text{(8-324)}$$

(check this out by substitution in (8-323a); in order to see how this solution is arrived at, one can make use of the basics of Laplace transformation; refer, for instance, to [140]). With this expression substituted in (8-323b), one obtains the equation of motion for x(t) in a form in which the bath co-ordinates are eliminated:


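$$\ddot x = -\frac{1}{m}V'(x) - \frac{1}{m}\sum_{i=1}^{N} g_i\left[q_i(0)\cos\omega_i t + \frac{p_i(0)}{m_i\omega_i}\sin\omega_i t - \frac{g_i}{m_i\omega_i}\int_0^t d\tau\,\sin\omega_i(t-\tau)\,x(\tau)\right] - \frac{1}{m}\sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\,x. \qquad \text{(8-325a)}$$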
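The formal solution (8-324) of the bath equations can be checked numerically for a single oscillator. The sketch below is only an illustration: the trajectory x(t) in `x_of_t`, the parameter values, and the function names are arbitrary assumptions, not taken from the text. It compares the closed-form expression, with the integral done by the trapezoidal rule, against a direct Runge-Kutta integration of dq/dt = p/m_i, dp/dt = -m_i ω_i² q - g_i x(t).

```python
import numpy as np

def x_of_t(t):
    """An arbitrary, assumed particle trajectory used only for the check."""
    return np.sin(0.7 * t) + 0.3 * t

def q_formula(t, q0, p0, m_i, w_i, g_i, n=20001):
    """Formal solution: homogeneous part plus the driven part, with the
    integral over tau evaluated by the trapezoidal rule on n points."""
    tau = np.linspace(0.0, t, n)
    f = np.sin(w_i * (t - tau)) * x_of_t(tau)
    h = tau[1] - tau[0]
    integral = h * (f.sum() - 0.5 * (f[0] + f[-1]))
    return (q0 * np.cos(w_i * t)
            + p0 / (m_i * w_i) * np.sin(w_i * t)
            - g_i / (m_i * w_i) * integral)

def q_rk4(t, q0, p0, m_i, w_i, g_i, n_steps=5000):
    """Direct RK4 integration of the first-order bath equations."""
    h = t / n_steps

    def rhs(s, y):
        return np.array([y[1] / m_i, -m_i * w_i**2 * y[0] - g_i * x_of_t(s)])

    y = np.array([q0, p0], dtype=float)
    s = 0.0
    for _ in range(n_steps):
        k1 = rhs(s, y)
        k2 = rhs(s + 0.5 * h, y + 0.5 * h * k1)
        k3 = rhs(s + 0.5 * h, y + 0.5 * h * k2)
        k4 = rhs(s + h, y + h * k3)
        y = y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return y[0]

if __name__ == "__main__":
    args = (0.4, -0.3, 2.0, 1.5, 0.8)  # q0, p0, m_i, w_i, g_i (illustrative)
    print(q_formula(5.0, *args), q_rk4(5.0, *args))
```

The two numbers printed should coincide up to the discretization errors of the two schemes.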


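We perform an integration by parts on the term appearing as an integral and re-arrange terms as

$$\ddot x = -\frac{1}{m}V'(x) - \frac{1}{m}\sum_{i=1}^{N} g_i\left[q_i(0)\cos\omega_i t + \frac{p_i(0)}{m_i\omega_i}\sin\omega_i t + \frac{g_i}{m_i\omega_i^2}\,x(0)\cos\omega_i t\right] - \frac{1}{m}\sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\int_0^t d\tau\,\cos\omega_i(t-\tau)\,\dot x(\tau). \qquad \text{(8-325b)}$$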
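The integration by parts invoked here can be written out for a single oscillator as follows; the sum over i is then immediate. This is a sketch that uses only the formal bath solution and the second-order particle equation given above.

```latex
\begin{align*}
\int_0^t d\tau\,\sin\omega_i(t-\tau)\,x(\tau)
 &= \left[\frac{\cos\omega_i(t-\tau)}{\omega_i}\,x(\tau)\right]_{\tau=0}^{\tau=t}
   - \frac{1}{\omega_i}\int_0^t d\tau\,\cos\omega_i(t-\tau)\,\dot x(\tau) \\
 &= \frac{1}{\omega_i}\left[x(t) - \cos(\omega_i t)\,x(0)
   - \int_0^t d\tau\,\cos\omega_i(t-\tau)\,\dot x(\tau)\right].
\end{align*}
% In the particle equation this integral enters with the factor
% + g_i^2 / (m m_i \omega_i):
\begin{align*}
\frac{g_i^2}{m\,m_i\omega_i}\int_0^t d\tau\,\sin\omega_i(t-\tau)\,x(\tau)
 &= \underbrace{\frac{g_i^2}{m\,m_i\omega_i^2}\,x(t)}_{\text{cancels the counter-term}}
  - \frac{g_i^2}{m\,m_i\omega_i^2}\cos(\omega_i t)\,x(0)
  - \frac{g_i^2}{m\,m_i\omega_i^2}\int_0^t d\tau\,\cos\omega_i(t-\tau)\,\dot x(\tau).
\end{align*}
```

The first term on the right cancels the counter-term contribution, the second joins the terms depending on the initial data, and the third is the memory term.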
mmw? Jo 


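This is an integro-differential equation analogous in form to the generalized Langevin equation (8-310a):

$$m\dot v = -\int_0^t d\tau\,\gamma(t-\tau)\,v(\tau) + F(x) + R(t), \qquad \text{(8-326a)}$$

where $v = \dot x$, the external force is constant in time but has an x-dependence,

$$F(x) = -V'(x), \qquad \text{(8-326b)}$$

and the memory kernel is

$$\gamma(\tau) = \sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\cos(\omega_i\tau). \qquad \text{(8-326c)}$$

Finally, the expression for R(t) is seen to be

$$R(t) = -\sum_{i=1}^{N} g_i\left[\frac{p_i(0)}{m_i\omega_i}\sin\omega_i t + \left(q_i(0) + \frac{g_i}{m_i\omega_i^2}\,x(0)\right)\cos\omega_i t\right], \qquad \text{(8-326d)}$$

though it is not yet shown to be of the nature of a random force.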


1. The first term on the right hand side of (8-326a) differs from the cor- 
responding term in (8-310a) in that, in arriving at the former equation 
we have chosen the initial time at t = 0, instead of t = -∞; the latter
choice would correspond to a different value of the time at which the 
initial bath and particle co-ordinates are evaluated; since the bath co- 
ordinates will be chosen below in accordance with a canonical distri- 
bution of the bath states, this just leaves an arbitrariness in choosing 
the initial value of the particle co-ordinate, without altering the con- 
tent of (8-326a); since we will be concerned with expectation values in 
accordance with the equilibrium bath distribution, physical results of 


the theory will be invariant against a time translation. 


2. Observe that the third term on the right hand side of (8-323b), arising 
from the second term on the right hand side of (8-321c) (referred to as a ‘counter-term’), gets canceled in the final equation (8-325b).


On looking at the Langevin-like equation (8-326a) we first inquire as to how far the third term on the right hand side can be interpreted as a random force. On the face of it, this term is a deterministic one since it is a sum of well-behaved oscillatory terms. However, for a large number of oscillators in the bath, the sum, as a function of time, can indeed approximate a stochastic process. This can happen, for instance, if the frequencies of the bath oscillators are uncorrelated with one another. Though the sum is recurrent in time for arbitrarily large but finite N, the distinction from a truly random process is removed if one introduces a coarse-graining by assuming that the initial values of the position co-ordinates and momenta of the bath oscillators are determined in accordance with some appropriate probability distribution. Accordingly, we assume that the distribution of these variables is a canonical one corresponding to some specified temperature, say, T.


In other words, we invoke the thermodynamic limit in describing the bath, which is made up of an infinitely large number of oscillators with uncorrelated frequencies and is initially in equilibrium at a temperature T. The initial distribution will shortly be made more precise, but before that we look at the first term on the right hand side of (8-326a) and see what kind of a memory kernel $\gamma(\tau)$ it can possibly lead to. It is apparent that $\gamma(\tau)$ can indeed assume a wide range of forms, depending on the distribution of the frequencies of the bath oscillators as also of the coupling strengths $g_i$. Going over to the thermodynamic limit, one can assume continuously distributed frequencies $\omega$ and a distribution of the coupling strengths in the form of some appropriate function of $\omega$. Assuming that the number of frequencies falling within a small range $d\omega$ centered around $\omega$ is $\sigma(\omega)\,d\omega$, and that the scaled coupling strengths $g_i/\sqrt{m_i}$ are distributed in the form of a function $g(\omega)$, one obtains

$$\gamma(\tau) = \int_0^{\infty} d\omega\,\sigma(\omega)\,\frac{g(\omega)^2}{\omega^2}\cos\omega\tau. \qquad \text{(8-327)}$$

Different forms of the friction kernel $\gamma(\tau)$ result from various choices of $\sigma(\omega)$ and $g(\omega)$. For instance, if $\sigma(\omega)$ is proportional to $\omega^2$ and $g(\omega)$ is a constant, then the memory kernel works out to a delta function and one gets a Markovian Langevin equation.
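How a delta-function kernel emerges can be illustrated numerically. The sketch below rests on assumptions that are not in the text: a Debye-type density σ(ω) = 3Nω²/ω_D³ with a sharp cutoff ω_D, a constant scaled coupling, and a kernel of the form ∫ dω σ(ω) g²/ω² cos ωτ. All function names and parameter values are hypothetical. With these choices the kernel is c sin(ω_D τ)/τ with c = 3N g²/ω_D³, a peak of width about 1/ω_D whose area over τ ≥ 0 tends to cπ/2, which is the behavior of cπ δ(τ).

```python
import numpy as np

def kernel_quadrature(tau, omega_D, N_osc, g_const, n_grid=200000):
    """Memory kernel from a midpoint-rule frequency integral, assuming a
    Debye-type density sigma(w) = 3 N w^2 / w_D^3 and constant coupling."""
    d_omega = omega_D / n_grid
    omega = (np.arange(n_grid) + 0.5) * d_omega      # midpoints, never zero
    sigma = 3.0 * N_osc * omega**2 / omega_D**3      # density of frequencies
    integrand = sigma * g_const**2 / omega**2 * np.cos(omega * tau)
    return integrand.sum() * d_omega

def kernel_closed_form(tau, omega_D, N_osc, g_const):
    """Same kernel in closed form: c * sin(w_D tau) / tau."""
    c = 3.0 * N_osc * g_const**2 / omega_D**3
    # np.sinc(x) = sin(pi x)/(pi x), so this equals c*sin(w_D tau)/tau
    return c * omega_D * np.sinc(omega_D * np.asarray(tau) / np.pi)

def kernel_area(T, omega_D, N_osc, g_const, n=400001):
    """Trapezoidal estimate of the area of the kernel over 0 <= tau <= T."""
    tau = np.linspace(0.0, T, n)
    gam = kernel_closed_form(tau, omega_D, N_osc, g_const)
    h = tau[1] - tau[0]
    return h * (gam.sum() - 0.5 * (gam[0] + gam[-1]))

if __name__ == "__main__":
    omega_D, N_osc, g_const = 200.0, 1000, 0.1
    c = 3.0 * N_osc * g_const**2 / omega_D**3
    print(kernel_quadrature(0.01, omega_D, N_osc, g_const),
          kernel_closed_form(0.01, omega_D, N_osc, g_const))
    print(kernel_area(10.0, omega_D, N_osc, g_const) / (c * np.pi / 2.0))
```

The second printed number is close to 1: for times long compared to 1/ω_D the kernel acts on slowly varying functions like a delta function of strength cπ.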


Turning now to the force function R(t), we recall from above that it represents a stochastic process once we introduce a coarse-graining such as the one through the initial condition on the bath oscillators. A convenient prescription for the coarse-graining is formulated by referring to the bath Hamiltonian $H_B$ taken along with the interaction Hamiltonian $H_I$, which can be written as

$$H_{BI} = H_B + H_I = \sum_{i=1}^{N}\left[\frac{P_i^2}{2m_i} + \frac{1}{2}m_i\omega_i^2 Q_i^2\right], \qquad \text{(8-328a)}$$

where

$$P_i = p_i, \qquad Q_i = q_i + \frac{g_i}{m_i\omega_i^2}\,x. \qquad \text{(8-328b)}$$

Thus, for any given value of the particle co-ordinate x, $H_{BI}$ represents a set of harmonic oscillators, the position co-ordinates of the oscillators being related to those of the bath oscillators by fixed displacements. We now assume that the system of these oscillators is in thermal equilibrium at the initial time t = 0, as a consequence of which R(t), given by (8-326d), assumes the form

$$R(t) = -\sum_{i=1}^{N} g_i\left[\frac{P_i(0)}{m_i\omega_i}\sin\omega_i t + Q_i(0)\cos\omega_i t\right]. \qquad \text{(8-329)}$$


It is now straightforward to check that, with respect to the distribution of the initial conditions $Q_i(0), P_i(0)$ (i = 1, 2, ...) mentioned above,

$$\langle P_i(0)\rangle = \langle p_i(0)\rangle = 0, \qquad \langle Q_i(0)\rangle = \left\langle q_i(0) + \frac{g_i}{m_i\omega_i^2}\,x(0)\right\rangle = 0, \qquad \text{(8-330a)}$$

$$\langle p_i(0)^2\rangle = m_i k_B T, \qquad \left\langle\left(q_i(0) + \frac{g_i}{m_i\omega_i^2}\,x(0)\right)^2\right\rangle = \frac{k_B T}{m_i\omega_i^2}, \qquad \text{(8-330b)}$$

(check this out; the relations (8-330b) are simply the equipartition formulae of each of the oscillators in $H_{BI}$). The time correlation of R(t) is then seen to be

$$\langle R(t)R(t')\rangle = \sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\,k_B T\cos\omega_i(t-t'), \qquad \text{(8-331)}$$

(check this out as well). Comparing with (8-326c) one arrives at the formula

$$\langle R(t)R(t')\rangle = k_B T\,\gamma(t-t'), \qquad \text{(8-332)}$$

which expresses the Fluctuation-dissipation theorem in the present context. This is consistent with (8-320) in view of the definition (8-315b).
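The fluctuation-dissipation relation between the random force and the memory kernel lends itself to a simple Monte Carlo check for a finite bath. The sketch below is an illustration only: the bath parameters, the sample size, and the function names are arbitrary assumptions. It draws the initial data of the displaced oscillators from the canonical (Gaussian) distribution, with variances fixed by equipartition, and compares the sampled correlation ⟨R(0)R(t)⟩ with k_B T γ(t).

```python
import numpy as np

def sample_random_force(times, m_i, w_i, g_i, kT, n_samples, rng):
    """Sample R(t) = -sum_i g_i [P_i(0)/(m_i w_i) sin(w_i t) + Q_i(0) cos(w_i t)]
    with P_i(0), Q_i(0) drawn from the canonical distribution of the
    displaced oscillators. Returns an array of shape (n_samples, len(times))."""
    n_osc = m_i.size
    P0 = rng.normal(0.0, np.sqrt(m_i * kT), size=(n_samples, n_osc))
    Q0 = rng.normal(0.0, np.sqrt(kT / (m_i * w_i**2)), size=(n_samples, n_osc))
    phase = np.outer(times, w_i)                      # shape (n_t, n_osc)
    sin_part = (P0 * g_i / (m_i * w_i)) @ np.sin(phase).T
    cos_part = (Q0 * g_i) @ np.cos(phase).T
    return -(sin_part + cos_part)

def memory_kernel(tau, m_i, w_i, g_i):
    """gamma(tau) = sum_i g_i^2/(m_i w_i^2) cos(w_i tau) for an array of tau."""
    return np.sum(g_i**2 / (m_i * w_i**2) * np.cos(np.outer(tau, w_i)), axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m_i = np.linspace(1.0, 2.0, 20)     # illustrative bath masses
    w_i = np.linspace(0.5, 5.0, 20)     # illustrative bath frequencies
    g_i = np.full(20, 0.3)              # illustrative couplings
    kT = 1.5
    times = np.linspace(0.0, 3.0, 7)    # times[0] = 0 is the reference time
    R = sample_random_force(times, m_i, w_i, g_i, kT, 100000, rng)
    corr = np.mean(R[:, [0]] * R, axis=0)             # sampled <R(0) R(t)>
    print(np.c_[times, corr, kT * memory_kernel(times, m_i, w_i, g_i)])
```

The last two printed columns agree up to sampling noise, which decreases as the inverse square root of the number of samples.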
In this instance of the oscillator bath, the effect of reduction is seen in the replacement of the infinite number of parameters $m_i, \omega_i, g_i$ by the memory function $\gamma(\tau)$ and the random force R(t). Once the reduction is made, one can make use of a kernel $\gamma(\tau)$ and a random force R(t) phenomenologically so as to effectively describe the behavior of the system under consideration. Of course, the choice has to be compatible with the nature of the bath variables that get blotted out in the process.


8.7.2.3 Reduction from oscillator bath: quantum mechanical 


The reduction from an oscillator heat bath can be done in the quan- 
tum mechanical context as well, where the derivation proceeds analo- 
gously [38]. The parallel between the classical and quantum mechanical 
cases is quite close since the quantum mechanical Hamiltonian does not 
involve products of non-commuting variables, and the Heisenberg equa- 
tions of motion (see below) are formally similar to the classical ones. In 
reality, though, the two problems differ in content in spite of their formal similarity.


The quantum problem starts with the Hamiltonian operator of the composite system made up of the Brownian particle and the oscillator bath, written as

$$\hat H = \hat H_S + \hat H_B + \hat H_I = \left(\frac{\hat p^2}{2m} + V(\hat x)\right) + \sum_{i=1}^{N}\left(\frac{\hat p_i^2}{2m_i} + \frac{1}{2}m_i\omega_i^2\hat q_i^2\right) + \left(\sum_{i=1}^{N} g_i\hat q_i\hat x + \sum_{i=1}^{N}\frac{g_i^2}{2m_i\omega_i^2}\,\hat x^2\right), \qquad \text{(8-333)}$$

where the hats over symbols denote operators representing the respective observables, and where there is a term by term correspondence with the classical Hamiltonian. Note that the counter-term $\left(\sum_{i=1}^{N}\frac{g_i^2}{2m_i\omega_i^2}\,\hat x^2\right)$ has been included in the interaction Hamiltonian so that it is only the potential $V(\hat x)$ that appears in the final reduced equation, with the effect of the counter-term getting canceled.


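The quantum context differs from the classical in that a classical Poisson bracket gets replaced with $\frac{1}{i\hbar}$ times the commutator of the corresponding operators. In the case of the Hamiltonian $\hat H$ of (8-333) the resulting equations of motion have an exact correspondence with the classical ones since it does not involve a product of non-commuting operators:

$$\text{[particle:]}\quad \dot{\hat x} = \frac{\hat p}{m}, \qquad \dot{\hat p} = -V'(\hat x) - \sum_{i=1}^{N} g_i\hat q_i - \sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\,\hat x, \qquad \text{(8-334a)}$$

$$\text{[bath:]}\quad \dot{\hat q}_i = \frac{\hat p_i}{m_i}, \qquad \dot{\hat p}_i = -m_i\omega_i^2\hat q_i - g_i\hat x \quad (i = 1, 2, \ldots, N). \qquad \text{(8-334b)}$$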

Here and in the following we describe quantum mechanical time development in the Heisenberg picture, in which an operator $\hat A$ evolves in time as

$$\dot{\hat A} = \frac{1}{i\hbar}[\hat A, \hat H]. \qquad \text{(8-335)}$$


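The bath equations can be integrated, with $\hat x$, $\hat p$ treated as fixed parameters, yielding

$$\hat q_i(t) = \hat q_i(0)\cos\omega_i t + \frac{\hat p_i(0)}{m_i\omega_i}\sin\omega_i t - \frac{g_i}{m_i\omega_i}\int_0^t d\tau\,\sin\omega_i(t-\tau)\,\hat x(\tau), \qquad \text{(8-336a)}$$

where the last term on the right hand side can be integrated by parts to yield

$$\frac{g_i}{m_i\omega_i}\int_0^t d\tau\,\sin\omega_i(t-\tau)\,\hat x(\tau) = \frac{g_i}{m_i\omega_i^2}\left[\hat x(t) - \cos\omega_i t\,\hat x(0) - \int_0^t d\tau\,\cos\omega_i(t-\tau)\,\dot{\hat x}(\tau)\right]. \qquad \text{(8-336b)}$$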


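Substituting in the Heisenberg equation of motion for the particle and solving for the latter one obtains, in analogy with (8-325b),

$$m\ddot{\hat x}(t) = -\int_0^t d\tau\,\gamma(t-\tau)\,\dot{\hat x}(\tau) - V'(\hat x) + \hat R(t), \qquad \text{(8-337a)}$$

where the friction kernel $\gamma(\tau)$ is again given by

$$\gamma(\tau) = \sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\cos\omega_i\tau, \qquad \text{(8-337b)}$$

while $\hat R(t)$, to be identified as the random force operator, appears in the form

$$\hat R(t) = -\sum_{i=1}^{N} g_i\left[\frac{\hat P_i(0)}{m_i\omega_i}\sin\omega_i t + \hat Q_i(0)\cos\omega_i t\right], \qquad \text{(8-337c)}$$

with

$$\hat P_i = \hat p_i, \qquad \hat Q_i = \hat q_i + \frac{g_i}{m_i\omega_i^2}\,\hat x. \qquad \text{(8-337d)}$$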
Analogous to the classical case, the operators $\hat P_i(t), \hat Q_i(t)$ (i = 1, 2, ..., N) appear as the momentum and position operators of a collection of displaced harmonic oscillators described by the Hamiltonian

$$\hat H_{BI} = \hat H_B + \hat H_I = \sum_{i=1}^{N}\left[\frac{\hat P_i^2}{2m_i} + \frac{1}{2}m_i\omega_i^2\hat Q_i^2\right], \qquad \text{(8-338)}$$

where the operators are in the Heisenberg representation, evaluated at t = 0 (cf. [38], which follows a slightly different procedure, while arriving at analogous results).


One can go through the same arguments regarding the friction kernel $\gamma(\tau)$ as in the classical case. In particular, in the thermodynamic limit ($N \to \infty$), assuming that the frequencies of the bath oscillators are distributed in accordance with density $\sigma(\omega)$, and the scaled coupling constants are distributed as $g(\omega)$, one obtains $\gamma(\tau)$ in the form (8-327), which reduces to a delta function if $g(\omega)$ is a constant and $\sigma(\omega)$ varies as $\omega^2$.


Analogous to the classical case, the force $\hat R(t)$ describes an operator-valued stochastic process in the thermodynamic limit when the set of the displaced oscillators with position and momentum co-ordinates $\hat Q_i(t), \hat P_i(t)$ defined above are assumed to obey a canonical distribution at initial time t = 0 corresponding to a temperature T, but now the operators $\hat R(t)$, $\hat R(t')$ do not commute for $t \neq t'$. Thus, in the quantum context the equilibrium expectation $\langle\hat R(t)\hat R(t')\rangle$ is not the correct correlation function to feature in the FDT. Instead one needs the Kubo correlation introduced in sec. 8.4.7.4. The Kubo correlation between $\hat R(t)$ and $\hat R(0)$ appears as

$$\langle\hat R(t)\hat R(0)\rangle_K = \frac{1}{\beta}\int_0^{\beta} d\lambda\,\langle\hat R(-i\hbar\lambda)\hat R(t)\rangle, \qquad \text{(8-339)}$$

where $\hat R(-i\hbar\lambda)$ stands for $e^{\lambda\hat H_{BI}}\hat R(0)e^{-\lambda\hat H_{BI}}$ (recall that, by our choice of the initial ensemble, equilibrium averages are to be evaluated with reference to the Hamiltonian $\hat H_{BI}$ (formula (8-338)) representing the collection of displaced oscillators). One can work out the right hand side of (8-339) by taking note of the following standard formulae for independent quantum oscillators in equilibrium at temperature T:

$$\langle\hat Q_i(0)\hat P_j(0)\rangle = \frac{i\hbar}{2}\delta_{ij}, \qquad \langle\hat P_i(0)\hat Q_j(0)\rangle = -\frac{i\hbar}{2}\delta_{ij},$$

$$\langle\hat Q_i(0)\hat Q_j(0)\rangle = \frac{\hbar}{2m_i\omega_i}\,\delta_{ij}\coth\frac{\beta\hbar\omega_i}{2}, \qquad \langle\hat P_i(0)\hat P_j(0)\rangle = \frac{m_i\hbar\omega_i}{2}\,\delta_{ij}\coth\frac{\beta\hbar\omega_i}{2}. \qquad \text{(8-340)}$$

This gives

$$\langle\hat R(t)\hat R(0)\rangle_K = k_B T\sum_{i=1}^{N}\frac{g_i^2}{m_i\omega_i^2}\cos\omega_i t, \qquad \text{(8-341a)}$$

(check this out as well; it will involve a bit of algebra). In other words, the FDT appears in the present context as

$$\langle\hat R(\tau)\hat R(0)\rangle_K = k_B T\,\gamma(\tau). \qquad \text{(8-341b)}$$

Comparing with (8-332), one observes that the Kubo correlation is indeed the quantum generalization of classical correlations in describing non-equilibrium response of systems (of which the static response constitutes a special case, refer back to sec. 8.4.7.3).
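The algebra behind the Kubo correlation of the random force can be sketched for a single displaced oscillator of mass m and frequency ω; summing over the bath oscillators with weights $g_i^2$ then gives the result for $\hat R$. The ladder-operator notation ($\hat a$, $\hat a^\dagger$, $\bar n$) is introduced here only for this illustration and is not used in the text.

```latex
\begin{align*}
\hat Q(t) &= \sqrt{\tfrac{\hbar}{2m\omega}}\,
  \big(\hat a\,e^{-i\omega t} + \hat a^\dagger e^{i\omega t}\big), \qquad
\hat Q(-i\hbar\lambda) = \sqrt{\tfrac{\hbar}{2m\omega}}\,
  \big(\hat a\,e^{-\lambda\hbar\omega} + \hat a^\dagger e^{\lambda\hbar\omega}\big), \\
\langle \hat Q(-i\hbar\lambda)\,\hat Q(t)\rangle
 &= \frac{\hbar}{2m\omega}\Big[(\bar n+1)\,e^{-\lambda\hbar\omega}e^{i\omega t}
    + \bar n\,e^{\lambda\hbar\omega}e^{-i\omega t}\Big], \qquad
 \bar n = \frac{1}{e^{\beta\hbar\omega}-1}, \\
\frac{1}{\beta}\int_0^{\beta} d\lambda\,\langle \hat Q(-i\hbar\lambda)\,\hat Q(t)\rangle
 &= \frac{1}{\beta}\,\frac{\hbar}{2m\omega}\,\frac{1}{\hbar\omega}
    \Big[(\bar n+1)\big(1-e^{-\beta\hbar\omega}\big)e^{i\omega t}
    + \bar n\big(e^{\beta\hbar\omega}-1\big)e^{-i\omega t}\Big] \\
 &= \frac{k_B T}{m\omega^2}\cos\omega t ,
\end{align*}
% since (\bar n + 1)(1 - e^{-\beta\hbar\omega}) = 1 and \bar n (e^{\beta\hbar\omega} - 1) = 1.
```

The factors of ħ drop out, so that the Kubo correlation of the oscillator co-ordinate coincides with the classical equipartition result at all temperatures.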
In closing this section I have to point out that the generalized Langevin equation (8-337a) derived in the quantum context resembles the corresponding classical equation (8-326a) since the force function F(x) in the latter can be expressed as $-V'(x)$, assuming that a potential V(x) exists, but the resemblance is only formal since the quantum equation involves operators and their time derivatives. When one takes expectation values, the two equations agree except in the force terms since, generally speaking, $\langle V'(\hat x)\rangle$ does not equal $V'(\langle\hat x\rangle)$. This makes the quantum behavior differ from the classical one, though both lead to a state of equilibrium with the bath in the long run, where the particle variables share equilibrium fluctuations of the bath. The case of a harmonic oscillator ($V(\hat x) = \frac{1}{2}m\omega^2\hat x^2$) interacting with a harmonic oscillator bath is exceptional, in that $\langle\hat x(t)\rangle$ and $\langle\hat p(t)\rangle$ both follow the classical equation and an exact solution of the quantum problem can be constructed. In the case of a nonlinear force, an exact solution is thwarted by the fact that the nonlinearity affects both the noise term and the memory term (refer to [150], section 8.5).


8.7.3 The master equation approach: a brief outline 


Consider a quantum mechanical system described by a Hamiltonian

$$H = H_0 + V, \qquad \text{(8-342)}$$

where V stands for a small perturbation over the unperturbed Hamiltonian $H_0$. The perturbation may represent the effect of an external field weakly coupled to the system or can be a part of the Hamiltonian of the system itself (in case the latter is an isolated one) or can even describe its interaction with a bath in some appropriate sense (for this case, see below). The eigenvalues and eigenvectors of $H_0$ are assumed to be known. Assuming that the system is initially in a mixed state with some given probability distribution over the unperturbed states, one would like to know how the occupation probabilities evolve with time, and under what conditions these evolve towards the probabilities characterizing an equilibrium state, viz., those characterizing a canonical or microcanonical distribution. The evolution of probabilities towards an equilibrium distribution is referred to as a process of relaxation.


The master equations essentially describe the changes in the probabilities
of the unperturbed states with time, the simplest of these being in the
form of rate equations. Thus, if the density matrix describing the system
at time $t$ be $\hat\rho(t)$, and its diagonal elements in the representation
in which $H_0$ is diagonal be $\rho_{nn}(t)$ ($n = 1, 2, \cdots$), then these
denote the probabilities $P_n(t)$ whose time evolution is described by the
master equation.


The formulation of the principles of quantum mechanics in terms of
density matrices for simple and composite systems can be found in
many standard text-books. The account given in [14] is useful in the
context of equilibrium and non-equilibrium statistical mechanics.

8.7.3.1 The Pauli Master equation


In the simplest case, the rates of change of the probabilities $P_n(t)$
($n = 1, 2, \cdots$) are expressed in the form of a closed set of equations,

$$\frac{dP_n}{dt} = \sum_m W_{m\to n}P_m(t) - \sum_m W_{n\to m}P_n(t) \qquad \Big(\sum_n P_n(t) = 1\Big), \tag{8-343}$$

where $W_{m\to n}$ stands for the transition rate from the unperturbed level
$|m\rangle$ with energy $E_m$ to the level $|n\rangle$ with energy $E_n$.
These transition rates are obtained from the Fermi golden rule as

$$W_{m\to n} = W_{n\to m} = \frac{2\pi}{\hbar}\,|\langle m|V|n\rangle|^2\,\delta(E_n - E_m) \tag{8-344}$$

(refer to [150] for an interpretation of this formula, including that of the
delta function). The first equality is referred to as the principle of
'microscopic reversibility'. Equation (8-343), with the transition rates
given by (8-344), is known as the Pauli master equation. The stationary
solution corresponds to a microcanonical equilibrium: all levels with a
given energy $E$ are equiprobable, with $P = \frac{1}{g}$, where $g$ stands
for the degeneracy at energy $E$ (the normalization condition in (8-343)
effectively refers to the levels with a specified value of the energy). As
things stand, this master equation does not say anything as to how the
probabilities for levels with different energies are related: each set of
degenerate levels retains the total probability that it had at $t = 0$
(this, indeed, is the characteristic of the microcanonical distribution).
This master equation does not explicitly involve any reduction since
transitions between levels with different energies are left out of
consideration.




From an examination of the nature of the matrix with elements
$W_{nm} = W_{m\to n}$, one can make statements about the way an initial
probability distribution evolves over time; the stationary solution
corresponds to an eigenvector of the matrix corresponding to eigenvalue
zero (such an eigenvalue always exists); generally speaking, an
arbitrarily chosen initial distribution asymptotically approaches this
state at large times, i.e., the stationary solution actually describes
the equilibrium state.
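The statements about the zero eigenvalue and the approach to the stationary state can be checked numerically. The following is a minimal sketch, assuming four degenerate levels with randomly chosen symmetric rates; the helper names `pauli_generator` and `evolve` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def pauli_generator(W):
    """Rate matrix G such that dP/dt = G @ P, with W[m, n] = W_{m->n}."""
    W = np.asarray(W, dtype=float)
    G = W.T.copy()                       # gain term: G[n, m] = W_{m->n}
    np.fill_diagonal(G, 0.0)
    loss = W.sum(axis=1) - np.diag(W)    # total rate out of each level n
    G -= np.diag(loss)                   # loss term sits on the diagonal
    return G

def evolve(G, P0, t):
    """P(t) = exp(G t) P0 for a symmetric G, via its eigen-decomposition."""
    lam, U = np.linalg.eigh(G)
    return U @ (np.exp(lam * t) * (U.T @ P0))

rng = np.random.default_rng(0)
A = rng.random((4, 4))
W = 0.5 * (A + A.T) + 0.1                # assumed symmetric rates: W_{m->n} = W_{n->m}
G = pauli_generator(W)

eigenvalues = np.linalg.eigvalsh(G)      # ascending order; the largest is zero
P0 = np.array([1.0, 0.0, 0.0, 0.0])      # all probability in one level at t = 0
P_late = evolve(G, P0, 50.0)

print("eigenvalues of G:", eigenvalues)
print("P(t = 50):", P_late)
```

Each column of `G` sums to zero, which expresses conservation of total probability and guarantees the zero eigenvalue. With symmetric rates the corresponding eigenvector is the uniform distribution, i.e., the microcanonical one.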


8.7.3.2 Heat bath master equation 


We now consider the problem of a system (S) coupled to a heat bath (B)
(refer to sec. 8.7.2.3), where the theory is more complex by one notch
and a reduction is involved, blotting away the bath variables. Thus, we
consider the Hamiltonian

$$H = H_S + H_B + H_I, \tag{8-345}$$

where $H_S$ refers to the system, $H_B$ to the bath, and $H_I$ represents
the interaction.


The right-hand side of the above equation is written in shorthand: the
three terms operate in different state spaces; in particular, the
interaction term operates in the composite space of S and B, referred to
as a direct product space. However, not all vectors in the direct product
space can be represented as direct products of vectors belonging to
the two factor spaces.

If $|n\rangle$ ($n = 1, 2, \cdots$) denote the eigenvectors of $H_S$ with
energies $E_n$, and $|\alpha\rangle$ ($\alpha = 1, 2, \cdots$) the
eigenvectors of $H_B$ with energies $\epsilon_\alpha$, then $|n\alpha\rangle$
($n, \alpha = 1, 2, \cdots$) are a set of basis vectors in the composite
space. The Pauli master equation, obtained once again by the application of
the Fermi golden rule in the composite space, reads

$$\frac{dP_{n\alpha}}{dt} = \sum_{m\gamma}\Big(W_{m\gamma\to n\alpha}P_{m\gamma}(t) - W_{n\alpha\to m\gamma}P_{n\alpha}(t)\Big) \qquad \Big(\sum_{n\alpha}P_{n\alpha}(t) = 1\Big), \tag{8-346a}$$


where the notation is transparent. The transition rates conform to the
principle of microscopic reversibility, and a microcanonical distribution
for the composite system appears as the stationary solution of the above
master equation. However, one can invoke a reduction at this stage by
making a number of assumptions about the bath, whereby the bath variables
are eliminated. Specifically, we assume that the probability distribution
for the bath at all times is independent of the system and is a canonical
one corresponding to temperature $T$, denoted by $P_\alpha = p_\alpha$
(this, in turn, involves assumptions about time scales of processes
pertaining to the system and to the bath). In other words, we assume that
$P_{n\alpha}$ is of the form

$$P_{n\alpha} = P_n\,p_\alpha \qquad \Big(\sum_\alpha p_\alpha = 1\Big). \tag{8-346b}$$


Substituting in (8-346a), and summing over the bath index $\alpha$, one
obtains the reduced master equation for the system S

$$\frac{dP_n}{dt} = \sum_m w_{m\to n}P_m(t) - \sum_m w_{n\to m}P_n(t) \qquad \Big(\sum_n P_n(t) = 1\Big), \tag{8-346c}$$

where the reduced transition rates $w_{n\to m}$ ($m, n = 1, 2, \cdots$) are
given by

$$w_{n\to m} = \sum_{\alpha\gamma}W_{n\alpha\to m\gamma}\,p_\alpha \tag{8-346d}$$

and satisfy the 'principle of detailed balance', now in the form

$$w_{n\to m}\,e^{-\beta E_n} = w_{m\to n}\,e^{-\beta E_m} \tag{8-346e}$$

(check these statements out). The stationary solution of this reduced
master equation gives the canonical distribution for the system states:

$$P_n \propto e^{-\beta E_n} \tag{8-346f}$$


(check this out too). 
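The two checks suggested above take only a few lines. The sketch below uses what is stated in the text: the composite rates are symmetric and conserve energy, so that $W_{n\alpha\to m\gamma} = W_{m\gamma\to n\alpha}$ vanishes unless $E_n + \epsilon_\alpha = E_m + \epsilon_\gamma$, and the bath distribution is canonical. The symbol $Z_B$ for the bath partition function, with $p_\alpha = e^{-\beta\epsilon_\alpha}/Z_B$, is introduced here for convenience.

```latex
\begin{aligned}
w_{m\to n}
  &= \sum_{\alpha\gamma} W_{m\gamma\to n\alpha}\, p_\gamma
   = \sum_{\alpha\gamma} W_{n\alpha\to m\gamma}\, \frac{e^{-\beta\epsilon_\gamma}}{Z_B}
   && \text{((8-346d) relabelled, microscopic reversibility)} \\
  &= \sum_{\alpha\gamma} W_{n\alpha\to m\gamma}\,
     \frac{e^{-\beta(\epsilon_\alpha + E_n - E_m)}}{Z_B}
   && \text{(energy conservation: } \epsilon_\gamma = \epsilon_\alpha + E_n - E_m\text{)} \\
  &= e^{-\beta(E_n - E_m)} \sum_{\alpha\gamma} W_{n\alpha\to m\gamma}\, p_\alpha
   = e^{-\beta(E_n - E_m)}\, w_{n\to m} .
\end{aligned}
```

Multiplying both sides by $e^{-\beta E_m}$ gives (8-346e). With $P_n \propto e^{-\beta E_n}$, detailed balance makes each pair of terms $w_{m\to n}P_m - w_{n\to m}P_n$ on the right-hand side of (8-346c) vanish separately, so the canonical distribution is stationary.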


Once again, the stationary solution describes the equilibrium state
since any initial probability distribution can be seen to tend to the
stationary distribution at large times.
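The approach to the canonical distribution can be illustrated numerically. The sketch below is not derived from any particular bath model: it assumes three hypothetical energy levels and Metropolis-type rates, chosen only because they satisfy the detailed balance condition (8-346e). All names and numerical values are illustrative.

```python
import numpy as np

beta = 1.2                               # inverse temperature 1/(k_B T), arbitrary units
E = np.array([0.0, 1.0, 2.5])            # hypothetical system energy levels E_n

# w[n, m] = w_{n->m}; Metropolis-type rates are an assumption made for illustration.
dE = E[None, :] - E[:, None]             # dE[n, m] = E_m - E_n
w = np.minimum(1.0, np.exp(-beta * dE))
np.fill_diagonal(w, 0.0)

# Detailed balance (8-346e): w_{n->m} e^{-beta E_n} = w_{m->n} e^{-beta E_m}
boltz = np.exp(-beta * E)
balance = w * boltz[:, None]             # balance[n, m] = w_{n->m} e^{-beta E_n}

# Generator of the reduced master equation (8-346c): dP/dt = G @ P
G = w.T - np.diag(w.sum(axis=1))

def evolve(G, P0, t):
    """P(t) = exp(G t) P0 using the eigen-decomposition of G."""
    lam, R = np.linalg.eig(G)
    coeffs = np.linalg.solve(R, P0.astype(complex))
    return (R @ (np.exp(lam * t) * coeffs)).real

P_eq = boltz / boltz.sum()               # canonical distribution (8-346f)
P_late = evolve(G, np.array([0.0, 0.0, 1.0]), 60.0)

print("canonical distribution:", P_eq)
print("P(t = 60):             ", P_late)
```

Starting with all the probability in the highest level, the distribution at late times agrees with the canonical one. The same happens for any other normalized initial distribution.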


Further conclusions can be drawn by making additional assumptions
about the bath and the interaction Hamiltonian $H_I$. In particular,
assuming that the bath is made up of a large number of harmonic oscillators
and that the interaction Hamiltonian is a product of a system operator
and a bath operator (as in sec. 8.7.2.3, where one observes that the
$\hat{x}^2$ term does not actually represent an interaction), one finds
that the master equation leads to results analogous to those obtained
previously (refer to [150]). As a point of interest, if the system under
consideration is a harmonic oscillator, then one finds that the mean energy
of the oscillator approaches the canonical equilibrium value regardless of
the initial condition.


The heat bath master equation can also be arrived at by the projection
operator approach outlined in sec. 8.7.4 below.


8.7.3.3 Master equation: general considerations 


A. The dynamical map. 


Master equations can be set up in general contexts, both classical and
quantum, where one describes the time evolution of probability
distributions in the form of rate equations, e.g., in chemical kinetics.


In sections 8.7.3.1 and 8.7.3.2 we have set up master equations in the
form of rate equations that determine the time evolution of the
probability distribution of unperturbed states of a system. The
probabilities are described by the diagonal elements of its density matrix,
where the latter involves no reference to the perturbing system, such as
the bath with which it may be in interaction. This actually involves a
double reduction: one blotting out the heat bath variables (recall the
special assumptions about the heat bath mentioned in sec. 8.7.3.2) and the
other blotting out the off-diagonal elements of the density matrix of the
system (S) under consideration. In the following, we consider a more
general form of the master equation that describes the evolution of the
off-diagonal elements as well. Conclusions arrived at on the basis of
diagonal master equations are obtained from these more general ones as
special instances.


Referring to the situation in which the system S is in interaction with a
bath (B), the density matrix $\hat\rho(t)$ of the composite system contains
complete information about observables relating to both the system and the
bath variables at any given time. On averaging over all possible bath
states, one arrives at the reduced density matrix $\hat\rho_S$ of the system
whose time evolution contains no explicit reference to the bath. The master
equation then describes the time evolution of this reduced density matrix,
where off-diagonal elements are involved along with the diagonal ones.


Considering a basis in the composite state space made up of product
vectors of the form $|n\alpha\rangle$ ($n, \alpha = 1, 2, \cdots$), where
$|n\rangle$ and $|\alpha\rangle$ are typical basis vectors in the state
spaces of S and B respectively (commonly assumed to be eigenvectors of the
respective Hamiltonians $H_S$, $H_B$), the elements of $\hat\rho_S$ in the
basis $\{|n\rangle\}$ are obtained as

$$(\hat\rho_S)_{mn} = \sum_\alpha \langle m\alpha|\hat\rho|n\alpha\rangle \qquad (m, n = 1, 2, \cdots). \tag{8-347a}$$

This process of reduction from $\hat\rho$ to $\hat\rho_S$ is referred to as
taking a partial trace over B and is denoted as

$$\hat\rho_S = \mathrm{Tr}_B\,\hat\rho. \tag{8-347b}$$
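The partial trace can be written out in a few lines of code. The sketch below assumes finite-dimensional state spaces and the basis ordering in which $|n\alpha\rangle$ carries the index $n\,d_B + \alpha$, which is the ordering produced by `np.kron`. The function name `partial_trace_B` and the example states are hypothetical.

```python
import numpy as np

def partial_trace_B(rho, dS, dB):
    """(rho_S)_{mn} = sum_alpha <m alpha| rho |n alpha>, as in (8-347a)."""
    rho4 = np.asarray(rho).reshape(dS, dB, dS, dB)   # indices (m, alpha, n, gamma)
    return np.einsum('iaja->ij', rho4)               # set gamma = alpha and sum

# A product state rho_S (x) rho_B reduces to rho_S.
rho_S = np.array([[0.7, 0.1j], [-0.1j, 0.3]])
rho_B = np.diag([0.5, 0.3, 0.2])
reduced_product = partial_trace_B(np.kron(rho_S, rho_B), 2, 3)

# An entangled pure state of two two-level systems reduces to a mixed state.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
reduced_bell = partial_trace_B(np.outer(bell, bell.conj()), 2, 2)

print(reduced_product)
print(reduced_bell)
```

The second example is a composite vector that cannot be written as a direct product of vectors of the two factor spaces. Its reduced density matrix is the maximally mixed one even though the composite state is pure.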




If the composite system made up of S and B is an isolated one, its
density matrix $\hat\rho(t)$ satisfies the equation (referred to as the von
Neumann equation or, at times, as the Liouville-von Neumann equation)

$$i\hbar\frac{d}{dt}\hat\rho = [H, \hat\rho], \tag{8-348a}$$

which can be formally integrated to give (in the case of an isolated system,
$H$ does not depend explicitly on time; in the more general case of a closed
system, $H_S$ may depend on time in virtue of its coupling to a classical
field)

$$\hat\rho(t) = e^{-iHt/\hbar}\,\hat\rho(0)\,e^{iHt/\hbar}. \tag{8-348b}$$


The density matrix $\hat\rho$ can be looked upon as a vector in a bigger
vector space (compared to the space of state vectors of the composite
system), and the transformation from $\hat\rho(0)$ to $\hat\rho(t)$
expressed by (8-348b) as a linear transformation in this bigger space
representing the action of a super-operator, i.e., an operator acting in
the space of linear operators. One observes that equation (8-348a) can be
written in a form analogous to the classical equation (8-91a) as

$$\frac{d}{dt}\hat\rho = -\frac{i}{\hbar}[H, \hat\rho] = -i\mathcal{L}\hat\rho, \tag{8-348c}$$

and the action of the super-operator representing the transformation
from $\hat\rho(0)$ to $\hat\rho(t)$ (formula (8-348b)) can be expressed as

$$\hat\rho(t) = e^{-i\mathcal{L}t}\hat\rho(0), \tag{8-348d}$$

where $\mathcal{L}$ is self-adjoint. This time evolution of the isolated
composite system is unitary and reversible. The super-operator
$L = -i\mathcal{L}$ is referred to as the generator of time translations.
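The idea of a super-operator acting on the density matrix regarded as a vector can be made concrete in a finite-dimensional setting. The sketch below assumes units with $\hbar = 1$, a randomly generated Hermitian matrix standing in for $H$, and row-major flattening of the density matrix, for which $A\hat\rho B$ is represented by `np.kron(A, B.T)`. All names and values are illustrative.

```python
import numpy as np

hbar = 1.0                                # assumed units
d = 3
rng = np.random.default_rng(1)

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = 0.5 * (A + A.conj().T)                # random Hermitian "Hamiltonian"

B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho0 = B @ B.conj().T
rho0 /= np.trace(rho0).real               # random density matrix

# Liouvillian as a d^2 x d^2 matrix: (H rho - rho H)/hbar on the flattened rho
I = np.eye(d)
Liou = (np.kron(H, I) - np.kron(I, H.T)) / hbar

t = 0.7
lam, U = np.linalg.eigh(Liou)             # Liou is self-adjoint
vec_t = U @ (np.exp(-1j * lam * t) * (U.conj().T @ rho0.reshape(-1)))
rho_t_super = vec_t.reshape(d, d)         # super-operator form, as in (8-348d)

# Direct evolution with the unitary operator exp(-i H t / hbar)
E, Vh = np.linalg.eigh(H)
Ut = Vh @ np.diag(np.exp(-1j * E * t / hbar)) @ Vh.conj().T
rho_t_direct = Ut @ rho0 @ Ut.conj().T

print(np.max(np.abs(rho_t_super - rho_t_direct)))
```

The two results agree to rounding error. Since `Liou` is Hermitian its eigenvalues are real (they are the Bohr frequencies of $H$), so the exponential in (8-348d) is a unitary transformation in the space of operators.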


It now remains to take a partial trace to see how the reduced density
matrix $\hat\rho_S$ evolves in time. This, however, is a non-trivial task,
and needs a number of assumptions and approximations before a reasonably
useful evolution equation is obtained for $\hat\rho_S$. In mathematical
terms, the evolution equation of the reduced density matrix $\hat\rho_S$ is
given by

$$i\hbar\frac{d}{dt}\hat\rho_S = \mathrm{Tr}_B[H, \hat\rho]. \tag{8-349}$$

This implies the existence of a mapping or a transformation
$\hat\rho_S(0) \to \hat\rho_S(t)$, commonly referred to as the dynamical
map, in the space of self-adjoint operators pertaining to the system S. To
put things briefly, it is the dynamical map (let us call it $V(t)$) that we
are interested in.


In virtue of the process of reduction the dynamical map does not, in
general, represent a reversible dynamics. We saw this in the case of the
Langevin equation, where the resistive force, arising as an effect of the
bath variables on the dynamics of S (such as a Brownian particle), was a
dissipative influence, causing S to settle down in an equilibrium
configuration at large times, with only the equilibrium thermal
fluctuations remaining in its motion. Though one cannot say much more
about the dynamical map in general, more specific statements can be made in
concrete situations, depending on what approximations are admissible in the
description of the system under consideration.

In a class of problems, the typical time scale of the relaxation processes
in the bath is short compared to the time scale characterizing the
systematic part of the dynamics of S, in which case the dynamical map
approximates a Markov process, i.e., the system dynamics does not show
memory effects. The dynamical map $V(t)$ depending on the parameter $t$ is
then said to define a dynamical semi-group, as in the case of a classical
Markovian process described by the Chapman-Kolmogorov equation (see, for
instance, [14]). The characteristic feature of a semi-group is that the
application of a dynamical map followed by another is equivalent to a single
dynamical map ($V(t_2)V(t_1) = V(t_1 + t_2)$), but there does not, in
general, exist an inverse of a dynamical map.


If the family of dynamical maps $V(t)$ characterized by the parameter $t$
forms a dynamical semi-group, then one can define a generator of time
translations $L_S$ such that the reduced density matrix $\hat\rho_S$
evolves as

$$\frac{d}{dt}\hat\rho_S(t) = L_S\,\hat\rho_S(t). \tag{8-350}$$

However, in this case $V(t)$ ($= e^{L_S t}$) is not unitary and cannot be
expressed in the form $e^{-i\mathcal{L}t}$, where $\mathcal{L}$ is
self-adjoint.


B. Master equation: the Lindblad form. 


As mentioned above, the quantum dynamical map $V(t)$ represents a Markov
process only when a number of requirements expressing a certain special
set of conditions are met. Assuming that these requirements are satisfied,
one can look at the issue of the most general form of the master
equation (8-350).

Going to the basics of the problem, it is possible to formulate in quite
general terms what these requirements should be and to check that these
are indeed met in any given situation of practical relevance. Once this is
done, one can set up the master equation and can then work out how the
reduced state of the system under consideration evolves in time. In
particular, numerous set-ups in quantum optics do conform to these
requirements, and their dynamics can be described, to a good degree of
approximation, in terms of master equations. The mathematical steps
necessary to effect the reduction from equation (8-349) to the master
equation are said to constitute the microscopic derivation of the latter.


In quantum optics the bath, or the reservoir, is constituted by the
infinite multitude of plane wave modes of the electromagnetic field
in open space that makes up the environment in which the state of
an atom, which can for numerous purposes be looked upon as a two-state
system, evolves in time. The atom is commonly enclosed in an
evacuated cavity, in which case the atom plus the cavity field acts
as the system interacting weakly with the reservoir field. The cavity
field essentially consists of a single mode that can be looked upon as
a simple harmonic oscillator while the atom is described as a 'spin'.
Given sufficient time, the spin and the oscillator come to equilibrium
with the environment. This constitutes the process of relaxation of
the 'damped spin' and the 'damped oscillator'.


Briefly, the basic assumption from which the microscopic derivation
proceeds is that of a small system S coupled weakly to a large system,
namely, the bath B. Commonly, B is assumed to be a system in equilibrium,
described by a time-invariant density operator $\hat\rho_B^{(0)}$. Assuming
that the composite system S+B is initially in the product state

$$\hat\rho(0) = \hat\rho_S(0) \otimes \hat\rho_B^{(0)}, \tag{8-351}$$

the basic assumption mentioned above can be made use of in effecting
a number of convenient approximations. On a very short time scale, the
coupling between S and B leads to an entanglement between the two,
and builds up correlations in the bath B. These correlations, however,
are quickly smothered, typically over a characteristic time $\tau_B$ (the
correlation time of the reservoir), and B continues to remain close to the
equilibrium state. One can thus assume that the coupling to B causes a
non-negligible change in the reduced state of S but the state of B itself is
not changed appreciably. One can then assume that, in an approximate
sense, if the dynamics over a time scale smaller than $\tau_B$ is glossed
over, then the state of the composite system at time $t$ is of the form
$\hat\rho_S(t) \otimes \hat\rho_B^{(0)}$.


One other approximation involved in the microscopic derivation is based
on the assumption of a weak coupling between S and B, as a result of
which the evolution equation can be expanded in a perturbation series
in which one can retain only the terms up to the second order in the
interaction Hamiltonian $H_I$. The coupling, moreover, is assumed to be
linear in two specific sets of operators pertaining to S and B, namely,
ones that act as raising and lowering operators for either system in its
energy basis. A third approximation is referred to as that of a Markovian
evolution of the state of S, wherein the rate of change of the reduced state
$\hat\rho_S$ at time $t$ is assumed not to depend on the reduced state at
times earlier than $t$. Strictly speaking, the evolution of $\hat\rho_S$
depends on the history of the evolution because of the cumulative effect of
the correlations that build up between S and B. However, the fact that the
bath returns to its equilibrium state within a very short time implies that
the correlations do not last over a longer time scale.


Finally, the microscopic derivation makes use of the rotating wave
approximation. This requires that the characteristic time $\tau_S$ over
which the internal dynamics of the system S, governed by its Hamiltonian
$H_S$, causes its state to change appreciably be small compared to the
relaxation time $\tau_{\mathrm{relax}}$, i.e., the characteristic time over
which the system-reservoir interaction causes the reduced state of S to
change appreciably. If the rotating wave approximation is not made then the
evolution equation for the reduced density operator contains terms
oscillating with various Bohr frequencies, where a Bohr frequency signifies
a difference of the form $\omega' - \omega$, with $\hbar\omega$,
$\hbar\omega'$ being typical eigenvalues of $H_S$. The rotating wave
approximation amounts to averaging out these oscillations on the assumption
that these are fast compared to the relaxation of $\hat\rho_S$.


On making all these approximations, one arrives at the master equation
for the system S coupled to the reservoir R (i.e., the bath B). The master
equation appears in a certain standard form, referred to as the Lindblad
form. The form of the latter can be deduced from general considerations
without reference to specific details of the systems S and R and to the
interaction Hamiltonian $H_I$ (i.e., without going through the steps of the
microscopic derivation), based on a set of assumptions regarding the
dynamical map in the state space of S.

It may be mentioned in this context that evolution equations more general
than ones in the standard form can be set up for the reduced density
matrix from which useful information can be extracted regarding the
dynamics of the system S, and some of these are generally referred to as
master equations. For instance, one can arrive at a master equation having
a more general form if one drops the rotating wave approximation.


In summary, one arrives, by means of a microscopic derivation, at a master
equation in the standard form under the assumption of a weak coupling
between S and R, provided the relevant time scales ($\tau_B$, $\tau_S$,
$\tau_{\mathrm{relax}}$; see above) satisfy the inequalities

$$\tau_B \ll \tau_{\mathrm{relax}}, \qquad \tau_S \ll \tau_{\mathrm{relax}}. \tag{8-352}$$


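It now remains to outline what the standard form of the master equation
is. Briefly, the master equation can be expressed in the form

$$\frac{d}{dt}\hat\rho_S = \frac{1}{i\hbar}[H'_S, \hat\rho_S] + \sum_{k=1}^{N}\gamma_k\Big(L_k\hat\rho_S L_k^\dagger - \frac{1}{2}L_k^\dagger L_k\hat\rho_S - \frac{1}{2}\hat\rho_S L_k^\dagger L_k\Big). \tag{8-353}$$
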

In the expression on the right hand side, all the operators pertain to the
system S under consideration (i.e., act in the state space of S), while,
at the same time, carrying the imprint of its coupling to the reservoir R.
In the first term of this expression, $H'_S$ is a Hermitian operator that
may differ from the Hamiltonian ($H_S$) of S and represents a
renormalization of the latter due to the interaction with the bath. While
this term corresponds to a unitary and reversible time evolution of the
reduced state of S, the next term, involving the $N$ operators $L_k$
($k = 1, 2, \cdots, N$) (referred to as Lindblad operators or jump
operators), represents an irreversible and dissipative evolution, and is at
times referred to as the dissipator. The number $N$ is an integer less than
or equal to $d^2 - 1$, where $d$ is the dimension of the state space of the
system S (we assume that the state space is finite dimensional for the sake
of simplicity). Finally, $\gamma_k$ ($k = 1, 2, \cdots, N$) are a set of
positive constants characterizing the dissipator. The assumptions relating
to the nature of the mapping $V(t)$ mentioned above do not determine the
operators $H'_S$ and $L_k$, or the constants $\gamma_k$, but only determine
the form of the evolution equation as stated above, this being the
standard, or the Lindblad form of the master equation.
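A small numerical sketch may help in seeing how an equation of the form (8-353) generates a semi-group that is not unitary. It assumes units with $\hbar = 1$, a two-level 'spin' with $H'_S$ taken equal to $H_S = \frac{1}{2}\omega\sigma_z$, and two jump operators (lowering and raising) with rates $\gamma_0(\bar{n}+1)$ and $\gamma_0\bar{n}$, where $\bar{n}$ is the Planck occupation number. These choices, the function names, and the numerical values are assumptions made for illustration, not results of a microscopic derivation carried out here.

```python
import numpy as np

def lindblad_superoperator(H, jump_ops, rates, hbar=1.0):
    """Matrix of the generator in (8-353) acting on the row-major flattened rho."""
    d = H.shape[0]
    I = np.eye(d)
    Lsup = (-1j / hbar) * (np.kron(H, I) - np.kron(I, H.T))
    for gamma_k, Lk in zip(rates, jump_ops):
        LdL = Lk.conj().T @ Lk
        Lsup = Lsup + gamma_k * (np.kron(Lk, Lk.conj())
                                 - 0.5 * np.kron(LdL, I)
                                 - 0.5 * np.kron(I, LdL.T))
    return Lsup

def dynamical_map(Lsup, t):
    """V(t) = exp(Lsup t), computed from the eigen-decomposition of Lsup."""
    lam, R = np.linalg.eig(Lsup)
    return R @ np.diag(np.exp(lam * t)) @ np.linalg.inv(R)

omega, gamma0, beta = 1.0, 0.2, 1.5           # assumed parameters
nbar = 1.0 / np.expm1(beta * omega)           # Planck occupation number
H = 0.5 * omega * np.diag([1.0, -1.0]).astype(complex)   # index 0: excited level
sm = np.array([[0, 0], [1, 0]], dtype=complex)           # lowering operator
Lsup = lindblad_superoperator(H, [sm, sm.conj().T],
                              [gamma0 * (nbar + 1.0), gamma0 * nbar])

V1, V2, V12 = (dynamical_map(Lsup, t) for t in (0.4, 1.1, 1.5))

rho0 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)   # pure superposition state
rho_1 = (V1 @ rho0.reshape(-1)).reshape(2, 2)
rho_late = (dynamical_map(Lsup, 200.0) @ rho0.reshape(-1)).reshape(2, 2)

p = np.exp(-beta * np.diag(H).real)
rho_thermal = np.diag(p / p.sum())            # canonical state at inverse temperature beta

print("semi-group property holds:", np.allclose(V2 @ V1, V12))
print("V(t) is unitary:", np.allclose(V1.conj().T @ V1, np.eye(4)))
print(rho_late.real)
```

In this example the map preserves the trace, satisfies $V(t_2)V(t_1) = V(t_1 + t_2)$, fails to be unitary, and carries the initial pure state to the canonical state at the bath temperature.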


The renormalized system Hamiltonian $H'_S$ differs from $H_S$ by a term
commonly referred to as the Lamb shift Hamiltonian, since it accounts
for the Lamb shift of the energy levels of an atom interacting with the
radiation field constituting its environment. This term commutes with
the un-renormalized Hamiltonian $H_S$.


In order to determine the operators $H'_S$ and $L_k$, and the constants
$\gamma_k$ ($k = 1, 2, \cdots, N$), one has to look at the details of the
systems S and R as also at the interaction Hamiltonian $H_I$, and to go
through the microscopic derivation of the master equation.


Given the parameters characterizing the master equation (8-353), one
can solve it (analytically or numerically) so as to see if it approaches a
stationary state at large times. Assuming that the bath state
$\hat\rho_B^{(0)}$ is one of thermodynamic equilibrium at temperature $T$,
one can, by referring to the microscopic derivation, indeed establish the
general result that $\hat\rho_S(t)$ approaches a state of equilibrium at the
same temperature $T$ (refer to [14], section 3.3.2).


Incidentally, the master equation in the general form (8-353) reduces to
the simpler diagonal master equation (8-346c) if the eigenvalues of the
system Hamiltonian $H_S$ are non-degenerate, i.e., in this case the diagonal
elements $P_n(t)$ of the reduced density matrix $\hat\rho_S$, expressed in
the energy basis, form a closed system of equations. In this case the
evolution of the diagonal part of $\hat\rho_S(t)$ decouples from that of the
off-diagonal part, and the relaxation to thermal equilibrium appears as a
direct consequence.
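The decoupling can be seen explicitly for a two-level atom. The sketch below assumes $H'_S \approx H_S$ with level spacing $\hbar\omega$ between an excited state $|e\rangle$ and a ground state $|g\rangle$, and two jump operators $\sigma_- = |g\rangle\langle e|$ and $\sigma_+ = |e\rangle\langle g|$ with rates $\gamma_0(\bar{n}+1)$ and $\gamma_0\bar{n}$, where $\bar{n} = (e^{\beta\hbar\omega}-1)^{-1}$. The symbols $\gamma_0$ and $\bar{n}$, and this choice of operators, are assumptions introduced for illustration. Taking matrix elements of (8-353) in the energy basis gives

```latex
\begin{aligned}
\frac{dP_e}{dt} &= -\gamma_0(\bar{n}+1)\,P_e + \gamma_0\bar{n}\,P_g ,
\qquad
\frac{dP_g}{dt} = \gamma_0(\bar{n}+1)\,P_e - \gamma_0\bar{n}\,P_g , \\[4pt]
\frac{d\rho_{eg}}{dt} &= \Big(-i\omega - \tfrac{1}{2}\gamma_0(2\bar{n}+1)\Big)\rho_{eg} , \\[4pt]
\frac{w_{g\to e}}{w_{e\to g}} &= \frac{\bar{n}}{\bar{n}+1} = e^{-\beta\hbar\omega} .
\end{aligned}
```

The populations $P_e = \rho_{ee}$ and $P_g = \rho_{gg}$ obey a closed pair of rate equations of the form (8-346c), with rates satisfying the detailed balance condition (8-346e), while the off-diagonal element decays independently of them.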


The master equation (8-353) resulting from the Markovian semi-group
property of the dynamical map $V(t)$ implies that it can be expressed
in the general form

$$\frac{d}{dt}\hat\rho_S = L\hat\rho_S, \tag{8-354}$$

where the super-operator $L$ (the generator of time translation) is
time-independent. One can generalize to the case of coupling to a
periodically driven external field by referring to what is known as the
Floquet basis ([14], chapter 8).


C. Quantum Brownian motion: the Caldeira-Leggett model. 


The rotating wave approximation makes it possible to describe the
evolution of the reduced density matrix in terms of a Markovian equation
defining a quantum dynamical semi-group. This approximation is based
on the assumption that the intrinsic dynamics of the system S is fast
compared to the relaxation process resulting from the interaction with
the bath B ($\tau_S \ll \tau_{\mathrm{relax}}$), which is the typical state
of affairs obtaining in quantum optical set-ups. This differs from the
situation typical of another class of problems where the intrinsic system
dynamics is slow compared to the decay of correlations in the bath while
the latter continues to satisfy the inequality
$\tau_B \ll \tau_{\mathrm{relax}}$. These are the conditions characterizing
the process of 'quantum Brownian motion', essentially the one considered
in sec. 8.7.2.3 and described by the system-bath model (8-333), starting
from which Caldeira and Leggett derived a master equation for a harmonic
oscillator coupled to a bath that differs from the Lindblad form (8-353).
Numerous applications in solid state physics involving strong
system-environment coupling at high temperatures come under the purview of
such a description.


Details of the derivation of the Caldeira-Leggett master equation are to be
found in [14]. It transpires that the time evolution of the reduced
density matrix depends on the way the bath correlation functions of the
form $\langle[B, B(\tau)]\rangle$ and $\langle\{B, B(\tau)\}\rangle$ depend
on $\tau$, where $B = -\sum_i g_i q_i$ (refer to (8-333)), and where
$\{\cdot,\cdot\}$ denotes the anti-commutator of two operators. It is at
this point that the spectral density of the bath frequencies assumes
relevance. Defining

$$J(\omega) = \sum_i \frac{g_i^2}{2m_i\omega_i}\,\delta(\omega - \omega_i), \tag{8-355}$$

one finds that the condition $\tau_B \ll \tau_{\mathrm{relax}}$ holds for
$J(\omega) \propto \omega$ (the so-called Ohmic spectral density; check that
this is precisely the condition under which the Markovian form of the
Langevin equation was obtained in sec. 8.7.2.3). It may be mentioned that an
additional high frequency cut-off ($\Omega$) is necessary in the bath
correlation functions for the Born-Markov approximation to be applicable.
This approximation holds even in the case of relatively strong bath-particle
interaction provided the temperature is sufficiently high. Finally, the
typical Bohr frequency for the intrinsic particle dynamics is to be small
relative to $\Omega$ and to $k_B T/\hbar$ so as to arrive at the
Caldeira-Leggett master equation.


It may be noted that the assumption of a slow intrinsic dynamics
was not made in deriving the Langevin equation in sec. 8.7.2.3. It is,
however, a necessary condition for the Caldeira-Leggett equation to
be a valid description of the evolution of the reduced density matrix.


We introduce the parameter $\gamma$ in terms of the spectral function
$J(\omega)$ as

$$J(\omega) = \frac{2m\gamma}{\pi}\,\omega\,\frac{\Omega^2}{\Omega^2 + \omega^2}, \tag{8-356}$$

where the high frequency cut-off parameter $\Omega$ is included through the
so-called Drude-Lorentz cut-off function.
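The behaviour of this spectral function is easily tabulated. The sketch below assumes the form $J(\omega) = (2m\gamma/\pi)\,\omega\,\Omega^2/(\Omega^2+\omega^2)$; the function name `drude_lorentz_J` and the parameter values are hypothetical, in arbitrary units.

```python
import numpy as np

def drude_lorentz_J(omega, m, gamma, Omega):
    """Ohmic spectral density with a Drude-Lorentz cut-off at frequency Omega."""
    omega = np.asarray(omega, dtype=float)
    return (2.0 * m * gamma / np.pi) * omega * Omega**2 / (Omega**2 + omega**2)

m, gamma, Omega = 1.0, 0.05, 20.0        # assumed values, arbitrary units

# Well below the cut-off the density is Ohmic: J(omega) ~ (2 m gamma / pi) omega.
omega_low = 1e-3 * Omega
ohmic_ratio = drude_lorentz_J(omega_low, m, gamma, Omega) / (
    2.0 * m * gamma * omega_low / np.pi)

# The cut-off turns the linear growth over; the maximum sits at omega = Omega.
grid = np.linspace(0.0, 5.0 * Omega, 5001)
omega_peak = grid[np.argmax(drude_lorentz_J(grid, m, gamma, Omega))]

print("J / (Ohmic form) at low frequency:", ohmic_ratio)
print("position of the maximum:", omega_peak)
```

For $\omega \ll \Omega$ the function is linear in $\omega$, while for $\omega \gg \Omega$ it falls off as $1/\omega$, so the bath modes of very high frequency are effectively decoupled from the particle.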


The Caldeira-Leggett master equation then appears in the form

$$\frac{d}{dt}\hat\rho_S(t) = -\frac{i}{\hbar}[H_S, \hat\rho_S] - \frac{i\gamma}{\hbar}[\hat{x}, \{\hat{p}, \hat\rho_S\}] - \frac{2m\gamma k_B T}{\hbar^2}[\hat{x}, [\hat{x}, \hat\rho_S]]. \tag{8-357}$$

In this equation, the first term on the right hand side, analogous to the
corresponding term in (8-353), represents the intrinsic dynamics of the
system S, the second term describes the relaxation under the influence
of the bath, while the third term assumes relevance in describing the
process of decoherence in which the off-diagonal elements of the reduced
density matrix decay at large times. The Caldeira-Leggett equation differs
from the Lindblad form of eq. (8-353), but can be converted to that form
by adding a term that is small in the high temperature limit (refer
to [14]).
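The role of the third term in decoherence can be made explicit in the position representation. The sketch below keeps only that term, taken to be the double commutator of $\hat\rho_S$ with $\hat{x}$ multiplied by $2m\gamma k_B T/\hbar^2$, and ignores the other two terms; this truncation, and the symbol $\Lambda$, are assumptions made for illustration.

```latex
\langle x|\,[\hat{x},[\hat{x},\hat\rho_S]]\,|x'\rangle = (x - x')^2\,\rho_S(x, x') ,
\\[4pt]
\frac{\partial}{\partial t}\rho_S(x, x', t)\Big|_{\text{third term}}
  = -\Lambda\,(x - x')^2\,\rho_S(x, x', t) ,
\qquad
\Lambda \equiv \frac{2m\gamma k_B T}{\hbar^2} ,
\\[4pt]
\rho_S(x, x', t) = \rho_S(x, x', 0)\,e^{-\Lambda (x - x')^2 t} .
```

The diagonal elements ($x = x'$) are untouched by this term, while an off-diagonal element decays at a rate proportional to the temperature and to the square of its distance $|x - x'|$ from the diagonal.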


On writing the Caldeira-Leggett equation in the position representation,
one obtains the result that the off-diagonal elements of its stationary
solution $\rho_S(x, x')$ (for which $\frac{d}{dt}\hat\rho_S = 0$) decay
exponentially with the distance $|x - x'|$ from the diagonal, and the
diagonal elements correspond to the equilibrium canonical distribution at
temperature $T$.

D. Irreversibility.

We continue to follow [14] in explaining the idea of irreversibility and
the related measure of entropy production in non-equilibrium processes,
assuming that the Markovian semi-group property pertaining to the
dynamical map $V(t)$ holds.


To start with, we consider the relative entropy $S(\hat\rho\|\hat\psi)$ of
any two states $\hat\rho$, $\hat\psi$ of the system S and see how the action
of $V(t)$ on these states affects the relative entropy.

The relative entropy of a state $\hat\rho$ with reference to a state
$\hat\psi$ is defined as

$$S(\hat\rho\|\hat\psi) = \mathrm{Tr}(\hat\rho\ln\hat\rho) - \mathrm{Tr}(\hat\rho\ln\hat\psi), \tag{8-358}$$

where both the terms on the right hand side are well defined in virtue of
the fact that the eigenvalues of a density matrix are zero or positive,
having a sum unity. It is a useful construct in quantum statistical
mechanics and satisfies a number of important relations making for its
relevance. Thus, one has the basic inequality

$$S(\hat\rho\|\hat\psi) \geq 0, \tag{8-359}$$

for all $\hat\rho$, $\hat\psi$. Moreover, the relative entropy is invariant
under a unitary transformation $U$, as is the von Neumann entropy of a
state, introduced in chapter 2 (eq. (2-7)):

$$S(\hat\rho\|\hat\psi) = S(U\hat\rho U^\dagger\|U\hat\psi U^\dagger). \tag{8-360}$$


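What is more, if $\hat\rho$ and $\hat\psi$ are states of a composite system
made up of components A, B, then the relative entropy decreases or remains
unchanged on reduction to either A or B:

$$S(\hat\rho\|\hat\psi) \geq S(\mathrm{Tr}_B\hat\rho\|\mathrm{Tr}_B\hat\psi). \tag{8-361}$$

Continuing to refer to a composite system made up of subsystems A, B, if
$\hat\rho$ is a state of the composite system and $\hat\rho^{(A)}$,
$\hat\rho^{(B)}$ are the corresponding reduced density matrices of the two
subsystems, the following relation is seen to hold (with the von Neumann
entropies on the right expressed in units of $k_B$):

$$S(\hat\rho\|\hat\rho^{(A)} \otimes \hat\rho^{(B)}) = S(\hat\rho^{(A)}) + S(\hat\rho^{(B)}) - S(\hat\rho), \tag{8-362}$$

where $\hat\rho^{(A)} \otimes \hat\rho^{(B)}$ stands for the direct product
of states of the subsystems A, B. Since the left hand side is non-negative
by (8-359), the entropy of the composite system cannot exceed the sum of the
entropies of its subsystems.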


On making use of the properties of the relative entropy one finds that

$$S(V(t)\hat\rho\|V(t)\hat\psi) \leq S(\hat\rho\|\hat\psi). \tag{8-363}$$

In other words, evolution under the dynamical map decreases the relative
entropy or leaves it unchanged. In particular this result tells us that the
entropy relative to a stationary state $\hat\psi$ (i.e., one for which
$V(t)\hat\psi = \hat\psi$) cannot increase under the action of the dynamical
map.
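The properties (8-359) to (8-362) of the relative entropy that are used here can be checked numerically on randomly generated states. The sketch below assumes finite-dimensional, full-rank density matrices (so that all logarithms are finite) and measures entropies in units of $k_B$. The helper names and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_state(d):
    """Random full-rank density matrix of dimension d."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def logm_psd(rho):
    """Logarithm of a positive definite Hermitian matrix."""
    w, U = np.linalg.eigh(rho)
    return (U * np.log(w)) @ U.conj().T

def relative_entropy(rho, psi):
    """S(rho || psi) = Tr(rho ln rho) - Tr(rho ln psi), as in (8-358)."""
    return np.trace(rho @ (logm_psd(rho) - logm_psd(psi))).real

def entropy(rho):
    """Von Neumann entropy -Tr(rho ln rho), in units of k_B."""
    w = np.linalg.eigvalsh(rho)
    return float(-np.sum(w * np.log(w)))

def reduce_to(rho, dA, dB, keep):
    """Partial trace of a state on A (x) B, keeping subsystem 'A' or 'B'."""
    r = rho.reshape(dA, dB, dA, dB)
    return np.einsum('iaja->ij', r) if keep == "A" else np.einsum('aiaj->ij', r)

dA, dB = 2, 3
rho, psi = random_state(dA * dB), random_state(dA * dB)

# Random unitary from a QR decomposition
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6)))

S_full = relative_entropy(rho, psi)                                  # (8-359)
S_rotated = relative_entropy(Q @ rho @ Q.conj().T, Q @ psi @ Q.conj().T)  # (8-360)
S_reduced = relative_entropy(reduce_to(rho, dA, dB, "A"),
                             reduce_to(psi, dA, dB, "A"))            # (8-361)

rho_A, rho_B = reduce_to(rho, dA, dB, "A"), reduce_to(rho, dA, dB, "B")
lhs = relative_entropy(rho, np.kron(rho_A, rho_B))                   # (8-362)
rhs = entropy(rho_A) + entropy(rho_B) - entropy(rho)

print(S_full, S_rotated, S_reduced)
print(lhs, rhs)
```

A numerical check on a few random states is of course no proof, but it makes the content of the four relations concrete: the relative entropy is non-negative, unchanged by a common unitary rotation, and not increased by discarding a subsystem.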


Referring to a stationary state $\hat\psi$ of an open system (of which an
equilibrium state constitutes a particular instance), we introduce the
quantity

$$\sigma(\hat\rho(t)) = -k_B\frac{d}{dt}S(\hat\rho(t)\|\hat\psi), \tag{8-364}$$

where $\hat\rho(t) = V(t)\hat\rho(0)$, $\hat\rho(0)$ being an initial state
close to $\hat\psi$. Then from the above property of decrease of the
relative entropy under the action of the dynamical map, it follows that
$\sigma$ is necessarily non-negative. This suggests the interpretation that
$\sigma$ represents the entropy production under the dynamical map close to
a stationary state. Indeed, if $\hat\psi$ represents a state of equilibrium
at temperature $T$, and $S(\hat\rho(t))$ denotes the von Neumann entropy of
the reduced state $\hat\rho(t)$ (recall that we are now considering the time
evolution of an open system), then it turns out that a balance equation of
the form

$$\frac{d}{dt}S(\hat\rho(t)) = \sigma(\hat\rho(t)) - J \tag{8-365a}$$

holds, where

$$J = k_B\,\mathrm{Tr}\big((L\hat\rho)\ln\hat\psi\big) \tag{8-365b}$$

has the interpretation of representing the entropy flow out of the system.
This is in accord with the principle of entropy balance outlined
in sec. 8.2. In the above expression, $L$ stands, as before, for the
super-operator representing the generator of time translation under the
dynamical map ($V(t) = e^{Lt}$).

The fact that $\sigma$ is a non-negative quantity can be looked upon as the
fundamental expression of irreversibility of time evolution under the
dynamical map $V(t)$.
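The non-negativity of the entropy production can be illustrated on a simple example. The sketch below assumes units with $\hbar = k_B = 1$ and a two-level system whose generator $L$ is of the Lindblad type, with lowering and raising operators and rates $\gamma_0(\bar{n}+1)$ and $\gamma_0\bar{n}$, so that the canonical state is stationary. The model, the function names, and the numerical values are assumptions made for illustration.

```python
import numpy as np

def lindblad_superoperator(H, jump_ops, rates):
    """Generator L acting on the row-major flattened density matrix (hbar = 1)."""
    d = H.shape[0]
    I = np.eye(d)
    Lsup = -1j * (np.kron(H, I) - np.kron(I, H.T))
    for g, Lk in zip(rates, jump_ops):
        LdL = Lk.conj().T @ Lk
        Lsup = Lsup + g * (np.kron(Lk, Lk.conj())
                           - 0.5 * np.kron(LdL, I) - 0.5 * np.kron(I, LdL.T))
    return Lsup

def logm_psd(rho):
    """Logarithm of a positive definite Hermitian matrix."""
    w, U = np.linalg.eigh(rho)
    return (U * np.log(w)) @ U.conj().T

omega, gamma0, beta = 1.0, 0.2, 1.5               # assumed parameters
nbar = 1.0 / np.expm1(beta * omega)
H = 0.5 * omega * np.diag([1.0, -1.0]).astype(complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)    # lowering operator
Lsup = lindblad_superoperator(H, [sm, sm.conj().T],
                              [gamma0 * (nbar + 1.0), gamma0 * nbar])

p = np.exp(-beta * np.diag(H).real)
psi = np.diag(p / p.sum()).astype(complex)        # stationary (canonical) state
ln_psi = logm_psd(psi)

lam, R = np.linalg.eig(Lsup)
rho0 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)   # mixed initial state
c0 = np.linalg.solve(R, rho0.reshape(-1))

times = np.linspace(0.0, 20.0, 201)
rel_entropy, sigma, J, dEdt = [], [], [], []
for t in times:
    rho = (R @ (np.exp(lam * t) * c0)).reshape(2, 2)       # rho(t) = V(t) rho(0)
    rho = 0.5 * (rho + rho.conj().T)                       # remove rounding noise
    drho = (Lsup @ rho.reshape(-1)).reshape(2, 2)          # L rho
    ln_rho = logm_psd(rho)
    rel_entropy.append(np.trace(rho @ (ln_rho - ln_psi)).real)
    sigma.append(-np.trace(drho @ (ln_rho - ln_psi)).real) # (8-364) with k_B = 1
    J.append(np.trace(drho @ ln_psi).real)                 # (8-365b) with k_B = 1
    dEdt.append(np.trace(drho @ H).real)
rel_entropy, sigma, J, dEdt = map(np.array, (rel_entropy, sigma, J, dEdt))

print("minimum entropy production:", sigma.min())
print("relative entropy, initial and final:", rel_entropy[0], rel_entropy[-1])
```

In this example the relative entropy decreases monotonically to zero, $\sigma$ stays non-negative throughout, and $J$ equals $-\beta\,dE/dt$, i.e., the rate at which the system loses energy to the bath divided by the temperature.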


8.7.4 The projection operator approach: a brief outline 


The content of the present section is based principally on [150], [140]. 


The projection operator approach is associated with the names, among
others, of Nakajima, Mori, and Zwanzig.


We consider a system S and focus on the time evolution of observables
characterizing the system. The aim is to identify a certain subset of
observable quantities ($A_1, A_2, \cdots$) and to try to form a closed set
of equations involving these observables alone (to be referred to in the
following as the 'relevant' ones), while ignoring the dynamics of a
complementary ('irrelevant') set of observables ($B_1, B_2, \cdots$). The
effect of this complementary set appears in the equations for the relevant
observables in the form of memory terms and random force terms as in the
generalized Langevin equations (sec. 8.7.1.4; see also sec. 8.7.2).


The present context differs from the one in which S appears as a subsystem
of a composite system (call it C) made up of S and a bath B: it is S
itself that now appears in the place of C, and no explicit reference is made
to a subsystem and a bath. Instead, we consider the subset of the relevant
observables of S and try to find the effect on the dynamics of this subset
that arises due to the complementary set of irrelevant observables, where this
effect appears only through a set of initial conditions. The case of reduction
from a composite system C to a subsystem S appears as a special instance 
of this more general approach where one focuses, among the set of observ- 
ables of C, on the observables of the subsystem in question that now form 
the relevant set, while the observables of the bath constitute the comple- 
mentary, irrelevant set. In either of these two types of situations (one where 
one considers a system S, not necessarily made up of a subsystem and a 
bath, and the other where such a decomposition is involved, with the bath 
variables playing the role of the irrelevant set of observables), a set of con- 
ditions needs to be satisfied in order that the approach may be a useful and 


effective one. 


We confine our considerations here to the classical context. The quantum 


mechanical formulation of the theory is formally of an analogous nature. 


The observables of a system are functions of the phase space variables
Q, P, where Q stands for the collection of the position co-ordinates of its
constituent particles and P for the collection of the corresponding
momenta. In the following, we denote the two sets together by the symbol
X, which stands for the set of 6N phase space variables for a system of N
particles. In writing the integral of some function of the
phase space variables, the infinitesimal phase space volume element will
be denoted by the symbol DX.


This marks a slight departure from the notation used in earlier sections of 
the book, where the phase space co-ordinates were collectively denoted by 


z instead of X. 


The set of observable quantities of the system under consideration makes
up a linear vector space, where the sum of any two observables A(X), B(X)
and the product of an observable with a number are defined in the usual
way. For the sake of generality, we admit complex functions of X,
though physical observables of interest are described by real functions
(refer back to sec. 8.4.2.1, which includes similar considerations, though
in a slightly different notation). The scalar product of two functions A(X), B(X)
will be denoted by $\langle A, B\rangle$. The following definition of the scalar product is
found to be useful in the present context,

$$\langle A, B\rangle = \int DX\, \rho(X)\, A^*(X)\, B(X), \qquad \text{(8-366)}$$


where $\rho(X)$ is a probability distribution, an appropriate choice for which
is the equilibrium distribution function corresponding to a canonical
ensemble at some specified temperature T.


We now assume that, based on some physical requirement (which we
need not specify now), we have identified a number of observables $A_1, A_2, \ldots$,
forming a linearly independent set {A}. For the sake of generality, this set
need not be orthonormal with respect to the above scalar product (though
it can be converted to an orthonormal one by the so-called Gram-Schmidt
procedure). In order to indicate that the observables, looked at as
normalizable functions on the phase space, form a vector space (say, S), we
represent them symbolically in the so-called Dirac notation, in which a
function A(X) is denoted by $|A\rangle$. The relevant observables of interest form
a subspace $\mathcal{A}$ of S, in which $|A_1\rangle, |A_2\rangle, \ldots$ form a linearly independent set.
The orthogonal subspace to $\mathcal{A}$ in S will be denoted by $\bar{\mathcal{A}}$. A vector $|B\rangle$ in
S will, generally speaking, have components in both $\mathcal{A}$ and $\bar{\mathcal{A}}$. We will be
interested in the projection in $\mathcal{A}$ that can be obtained by the action of a
projection operator on the vector $|B\rangle$. This projection operator is given by

$$P = \sum_{i,j} |A_i\rangle (N^{-1})_{ij} \langle A_j|, \qquad \text{(8-367a)}$$

where the elements of the matrix N are formed of scalar products of pairs
of vectors chosen from the set {A}:

$$N_{ij} = \langle A_i | A_j \rangle \qquad (i, j = 1, 2, \ldots). \qquad \text{(8-367b)}$$

That P is a projection operator is easily established:


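$$\begin{aligned}
P^2 &= \sum_{ij,kl} |A_i\rangle (N^{-1})_{ij} \langle A_j|A_k\rangle (N^{-1})_{kl} \langle A_l| \\
    &= \sum_{ij,kl} |A_i\rangle (N^{-1})_{ij} N_{jk} (N^{-1})_{kl} \langle A_l| = P \qquad \text{(8-368)}
\end{aligned}$$

(check also that $P|A_i\rangle = |A_i\rangle$ ($i = 1, 2, \ldots$)).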
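
The properties of the projection operator can also be checked numerically. The sketch below assumes a discretized stand-in for the phase space: functions of X become vectors of values at `M` sample points, and the measure $\rho(X)\,DX$ becomes a set of positive weights summing to one. The sample sizes, the random test functions, and all variable names are hypothetical choices made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, n = 200, 3                       # sample points, number of relevant observables

weights = rng.random(M)
weights /= weights.sum()            # discrete stand-in for rho(X) DX
W = np.diag(weights)

# Columns of A hold the (non-orthogonal, complex) relevant observables A_1..A_n
A = rng.normal(size=(M, n)) + 1j * rng.normal(size=(M, n))

def inner(f, g):
    """Weighted scalar product <f, g> = sum_k rho_k f_k^* g_k."""
    return np.sum(weights * np.conj(f) * g)

N = A.conj().T @ W @ A                       # N_ij = <A_i|A_j>
P = A @ np.linalg.inv(N) @ A.conj().T @ W    # P = sum_ij |A_i> (N^-1)_ij <A_j|
Q = np.eye(M) - P                            # complementary projector

print("P^2 = P      :", np.allclose(P @ P, P))
print("P A_i = A_i  :", np.allclose(P @ A, A))
print("P Q = 0      :", np.allclose(P @ Q, 0.0))
```

The factor `W` appears on the right of `P` because the bra $\langle A_j|$ carries the weight of the scalar product; with an orthonormal set the matrix `N` would reduce to the unit matrix.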


If the phase space functions $A_i$ ($i = 1, 2, \ldots$) were chosen to form an
orthonormal set of vectors in the space S of phase space functions, then the
projection operator P reduces to the familiar expression

$$P = \sum_i |A_i\rangle\langle A_i|, \qquad \text{(8-369a)}$$

since, in that case, one would have had

$$N_{ij} = \delta_{ij} \qquad (i, j = 1, 2, \ldots), \qquad \text{(8-369b)}$$

i.e., both N and $N^{-1}$ would be unit matrices.


The projection operator on the complementary subspace $\bar{\mathcal{A}}$ is

$$Q = I - P, \qquad \text{(8-370)}$$

where I stands for the unit operator in the full space of functions
$S = \mathcal{A} \oplus \bar{\mathcal{A}}$.


In the following, we consider an observable $|A\rangle$ belonging to $\mathcal{A}$ and focus
on its time evolution. Though $|A\rangle$ is a relevant observable, it will, in
general, move out of $\mathcal{A}$ in the course of time. Starting from an initial
time t = 0 we construct the evolved vector (recall that observables of the
system S are being treated as vectors here) at time t by making use of the
appropriate propagator (see below) and then project the evolved vector
back to $\mathcal{A}$. This expresses the evolution of vectors in $\mathcal{A}$ entirely in terms
of vectors in $\mathcal{A}$ itself, though one expects that the projection (which is a
type of reduction, since we avoid making explicit reference to observables
in $\bar{\mathcal{A}}$) bears the imprint of the excursion of the evolved vector beyond $\mathcal{A}$. It
turns out that, for an appropriate choice of the observables in $\mathcal{A}$, one
ends up with a generalized Langevin equation with a memory term and a
‘random force’ term.


It may be of some help to point out that, in the case of the particle in a liq- 
uid bath, the system of interest from the point of view of the present projec- 
tion operator approach is the composite object made up of the particle and 
the bath, while the relevant observables are those pertaining to the particle 
alone, the bath variables being the irrelevant ones that are projected out. In 
this sense, the generalized Langevin equation arrived at in sec. 8.7.2.2 con- 
cerning the evolution of the position and momentum of the particle, can be 
looked upon as an instance of the way the more general projection method 


would work, though the derivation followed a different route there. 
The time evolution of an observable $|A\rangle$ is given by

$$\frac{d}{dt}|A\rangle = i\mathcal{L}|A\rangle, \qquad \text{(8-371a)}$$

where $\mathcal{L}$ is an operator in the space of observables (S) whose action is
described, in the more familiar notation, as

$$i\mathcal{L}A(X) = \{A, H\}, \qquad \text{(8-371b)}$$

where H(X) stands for the Hamiltonian of the system S, $\{\cdot,\cdot\}$ stands for
the Poisson bracket of two phase space functions, and where we have
temporarily dropped the Dirac notation in order to remind ourselves that
the object $|A\rangle$ is actually a phase space function A(X). The above formula
expresses the time evolution of an observable in a Heisenberg type
description as in sec. 8.4.2.1 ([150] makes use of the operator $L = i\mathcal{L}$ in S,
where $\mathcal{L}$ is self-adjoint).


In the complementary, Schrödinger type description of time evolution,
the observables do not evolve with time, but a state of the system
described by the phase space distribution $\rho(X)$ does, in accordance
with (refer back to (8-91a))

$$\frac{\partial\rho}{\partial t} = -i\mathcal{L}\rho = \{H, \rho\}. \qquad \text{(8-372)}$$


Reverting to the Dirac type notation for the sake of clarity, one obtains,
from (8-371a),

$$|A(t)\rangle = e^{i\mathcal{L}t}|A(0)\rangle, \qquad \text{(8-373)}$$

where $e^{i\mathcal{L}t}$ is the propagator representing time translation through an
interval t, and describes a unitary evolution in S. We are now in a position
to derive a generalized Langevin equation for an observable A(X)
represented by a vector $|A\rangle$ belonging to $\mathcal{A}$, i.e., to the set of relevant
observables formed as linear combinations of members of the set {A}, whose
choice is left unspecified at this stage.


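The following operator identity can be verified by differentiation and use
of the initial condition at t = 0:

$$e^{i\mathcal{L}t} = e^{iQ\mathcal{L}t} + i\int_0^t d\tau\, e^{i\mathcal{L}(t-\tau)}\, P\mathcal{L}\, e^{iQ\mathcal{L}\tau}. \qquad \text{(8-374)}$$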
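
The verification by differentiation mentioned above can be sketched as follows. The sketch assumes that the identity (8-374) has the standard form $e^{i\mathcal{L}t} = e^{iQ\mathcal{L}t} + i\int_0^t d\tau\, e^{i\mathcal{L}(t-\tau)} P\mathcal{L}\, e^{iQ\mathcal{L}\tau}$, with $\mathcal{L}$ the operator of (8-371a) and $Q = I - P$; the symbol $U(t)$ is introduced here only as a name for the right hand side.

```latex
\begin{aligned}
U(t) &\equiv e^{iQ\mathcal{L}t}
      + i\int_0^t d\tau\, e^{i\mathcal{L}(t-\tau)}\, P\mathcal{L}\, e^{iQ\mathcal{L}\tau},
      \qquad U(0) = I, \\
\frac{dU}{dt} &= iQ\mathcal{L}\, e^{iQ\mathcal{L}t}
      + iP\mathcal{L}\, e^{iQ\mathcal{L}t}
      + i\mathcal{L}\left[\, i\int_0^t d\tau\, e^{i\mathcal{L}(t-\tau)}\,
        P\mathcal{L}\, e^{iQ\mathcal{L}\tau} \right] \\
  &= i\mathcal{L}\, e^{iQ\mathcal{L}t}
      + i\mathcal{L}\left[\, U(t) - e^{iQ\mathcal{L}t} \right]
   = i\mathcal{L}\, U(t).
\end{aligned}
```

The second term in the derivative comes from the upper limit of the integral, and $P + Q = I$ combines it with the first. Since $U(t)$ obeys the same first-order equation and the same initial condition as $e^{i\mathcal{L}t}$, the two operators coincide.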


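We now apply the two sides of this identity on $iQ\mathcal{L}|A\rangle$ and make a number
of operator manipulations (see [150]) so as to arrive at the final result that
involves the matrices $\Omega$ and $K(t)$ defined as follows:

$$\Omega_{ij} = \sum_k \langle A_i|\mathcal{L}|A_k\rangle (N^{-1})_{kj} \qquad (i, j = 1, 2, \ldots), \qquad \text{(8-375a)}$$

(where the matrix N is defined as in (8-367b)), and

$$K_{ij}(t) = -i\sum_k \langle F_i(t)|\mathcal{L}|A_k\rangle^* (N^{-1})_{kj} \qquad (i, j = 1, 2, \ldots), \qquad \text{(8-375b)}$$

in which $|F_i(t)\rangle$ is defined as

$$|F_i(t)\rangle = i e^{iQ\mathcal{L}t} Q\mathcal{L}|A_i\rangle \qquad (i = 1, 2, \ldots). \qquad \text{(8-375c)}$$

With these definitions, the generalized Langevin equation appears as

$$\frac{d}{dt}A_i(t) = -i\sum_j \Omega_{ij}A_j(t) - \sum_j \int_0^t d\tau\, K_{ij}(\tau) A_j(t-\tau) + F_i(t), \qquad \text{(8-376)}$$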
where, in this final form, we have once again dropped the Dirac notation
so as to conform to the more commonly followed norm. Notice that all
the terms on the right hand side refer to observables belonging to the
relevant set alone, the latter being the one formed by linear combinations
of observables belonging to {A}.


Eq. (8-376) is formally of the nature of a generalized Langevin equation 
(refer, for comparison, to (8-326a)), in which the first two terms resem- 
ble the systematic part of the evolution of the observable A(t) including 
a memory term, while the third term resembles a random force. How- 
ever, the resemblance is only formal since, up to this point, the equa- 
tion is essentially a re-formulation of the reversible dynamics expressed 
by (8-371a). In actually applying it to a concrete system S, one has to 
choose properly the set of relevant observables (the complementary set
of irrelevant observables does not appear explicitly in (8-376)) and make
suitable approximations, analogous to the assumption of an initial dis- 
tribution of bath states in sec. 8.7.2.2, so that the terms on the right 
hand side become physically meaningful. For instance, one may choose 
the relevant variables to be ones that may be considered to be slow as 
compared to the irrelevant variables, in which case the projection opera- 
tion acquires significance. On making the physical identification of F(t) 
as a random force, one can also show (refer to [140]) that the memory
kernel $K(t)$ is related to the auto-correlation of the random force, thereby
expressing the fluctuation-dissipation theorem in the present context.
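
A simple worked case may help fix the sign conventions and shows in what sense the equation is, by itself, only a re-formulation of the reversible dynamics. The example is an illustration added here, not one taken from the cited references: a one-dimensional harmonic oscillator with $H = p^2/2m + m\omega^2 q^2/2$, the canonical distribution as $\rho(X)$, and the relevant set $\{q, p\}$. It assumes the frequency matrix of (8-375a) in the form $\Omega_{ij} = \sum_k \langle A_i|\mathcal{L}|A_k\rangle (N^{-1})_{kj}$ and the first term of (8-376) in the form $-i\sum_j \Omega_{ij}A_j$.

```latex
\begin{aligned}
& i\mathcal{L}q = \{q, H\} = \frac{p}{m}, \qquad
  i\mathcal{L}p = \{p, H\} = -m\omega^2 q, \\
& N = \begin{pmatrix} \langle q, q\rangle & \langle q, p\rangle \\
                      \langle p, q\rangle & \langle p, p\rangle \end{pmatrix}
    = \begin{pmatrix} \dfrac{k_B T}{m\omega^2} & 0 \\ 0 & m k_B T \end{pmatrix}, \\
& \langle q|\mathcal{L}|p\rangle = -i\,\langle q, -m\omega^2 q\rangle = i k_B T, \qquad
  \langle p|\mathcal{L}|q\rangle = -i\,\Big\langle p, \frac{p}{m}\Big\rangle = -i k_B T, \\
& \Omega_{qp} = \frac{i}{m}, \qquad \Omega_{pq} = -i m\omega^2, \qquad
  \Omega_{qq} = \Omega_{pp} = 0, \\
& \frac{dq}{dt} = -i\,\Omega_{qp}\, p = \frac{p}{m}, \qquad
  \frac{dp}{dt} = -i\,\Omega_{pq}\, q = -m\omega^2 q .
\end{aligned}
```

Here $\mathcal{L}q$ and $\mathcal{L}p$ already lie in the relevant subspace, so that $Q\mathcal{L}|q\rangle = Q\mathcal{L}|p\rangle = 0$; the random force and the memory kernel vanish identically, and one is left with Hamilton's equations. Memory and random force terms arise only when the relevant set is not closed under the dynamics.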


A feature of particular relevance in the present formulation is that the
Langevin equation one arrives at is inherently a linear one, in contrast
to (8-326a) which is, in general, nonlinear in virtue of the force term.


For further details and concrete applications, including ones relating to 


hydrodynamics, refer to [84], [140], [150]. 


8.8 Non-equilibrium statistical mechanics: the 
approach to equilibrium 


In closing this chapter I return to the basic question of non-equilibrium 
statistical mechanics: why does an isolated system, left to itself from any 


initial configuration, approach the state of equilibrium? 


Taking into account all that has been said in this chapter, the closest 
response that one can make to this basic question is contained in the 
statement that the thermodynamic equilibrium state of a system is one 
of maximum entropy as compared to states of constrained equilibria com- 
patible with the conditions that define the equilibrium state in question. 
In other words, the entropy $S(X_1, X_2, \ldots, X_k)$ specified in terms of given
values of the macroscopic variables $X_1, X_2, \ldots, X_k$, where k is a given
positive integer, is larger than the entropy of the state defined in terms of
a number of macroscopic variables specified in addition to $X_1, X_2, \ldots, X_k$.
In relation to the equilibrium configuration specified by $X_1, X_2, \ldots, X_k$, all
these other states defined in terms of values of additional variables are 
referred to as constrained equilibria. In other words, the fundamental 


thermodynamic principle underlying all our considerations up to now is 
the stability of the equilibrium state under given constraints, expressed
by means of the principle of maximum entropy as stated above (refer to [15] 
for a complete and lucid exposition). This, however, is nothing more than 
what is already contained in the second law of thermodynamics, which 
includes as an in-built principle the statement that all natural processes 


have a preferred direction. 


Statistical mechanics aims at explaining the principles of thermodynamics 
by starting from the microscopic dynamics of thermodynamic systems. 
The pioneering and fundamental contributions of Boltzmann, Gibbs and
Maxwell, along with those of other contemporaries, are summarized, on the
one hand, by the Boltzmann formula for the equilibrium entropy (see,
for instance, [88]) and, on the other, by the microcanonical ensemble 
(and all the other related ensembles, chapter 2) of equilibrium statistical 
mechanics, both of which incorporate the maximum entropy principle 
(see sections 2.1.2, 2.1.5.1, and 2.1.5.3; the principles outlined in these 
sections in the quantum context can all be stated in the classical context 


as well — check this out). 


Thus, non-equilibrium processes in an isolated system in the near-equilibrium 
regime are all characterized by the property of regression to equilibrium 
precisely in virtue of the second law of thermodynamics or, when looked 
at from the point of view of statistical mechanics, in virtue of the equilib- 
rium state being described by the microcanonical ensemble. An equiva- 
lent way of saying this is that, in this near-equilibrium regime, the en- 


tropy production is necessarily a positive definite quantity. 
1. As we have seen in sec. 8.5, the requirement of positive definiteness of
the entropy production is equivalent to the condition that the quadratic
form $\sum_{i,j} L_{ij}A_iA_j$ has to be positive definite, i.e., the
eigenvalues of the matrix with the kinetic coefficients $L_{ij}$ as elements
are to be positive. In the case of a simple fluid this corresponds to the
requirement that the transport coefficients $\kappa, \eta, \zeta$ are to be all positive.
As mentioned earlier, this is related, in the ultimate analysis, to the
second law of thermodynamics which is incorporated into the
equilibrium ensembles of statistical mechanics in the form of the relevant
variational principles.


2. Recall that the positive definiteness of the entropy production also im- 
plies that a non-equilibrium steady state close to a state of equilibrium 


corresponds to a minimum value of this quantity. 
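
The eigenvalue criterion in item 1 can be illustrated with a short numerical sketch. The two matrices of kinetic coefficients below are made-up examples, not data for any real system, and the function names are hypothetical; the symmetric part of the matrix is used because only it contributes to the quadratic form.

```python
import numpy as np

def entropy_production(L, forces):
    """Quadratic form sum_ij L_ij A_i A_j for given thermodynamic forces A_i."""
    return float(forces @ L @ forces)

def is_positive_definite(L):
    """True if the symmetric part of L has only positive eigenvalues."""
    symmetric_part = 0.5 * (L + L.T)
    return bool(np.all(np.linalg.eigvalsh(symmetric_part) > 0.0))

L_admissible = np.array([[2.0, 0.5],
                         [0.5, 1.0]])     # both eigenvalues positive
L_forbidden = np.array([[1.0, 2.0],
                        [2.0, 1.0]])      # eigenvalues 3 and -1

forces = np.array([1.0, -1.0])
print(is_positive_definite(L_admissible), entropy_production(L_admissible, forces))
print(is_positive_definite(L_forbidden), entropy_production(L_forbidden, forces))
```

For the second matrix the forces along the eigenvector with negative eigenvalue give a negative value of the quadratic form, which is what the second law rules out.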


Put differently, the second law of thermodynamics expresses the
irreversibility of natural or spontaneous processes in terms of the entropy
principle: the entropy of the equilibrium state of a system under given 
constraints is maximum as compared with equilibrium states subject to 
additional constraints, a spontaneous process being one where the sys- 
tem attains the equilibrium state in the long run once the additional con- 
straints are lifted. All this is reflected in the microcanonical ensemble 
(or any other ensemble of equilibrium statistical mechanics equivalent 
to the microcanonical one) describing the equilibrium state and in the 


Boltzmann formula for entropy. 


In interpreting this general feature of macroscopic systems in terms of the 
microscopic dynamics governed by the known laws of physics, Boltzmann 


formulated the ergodic hypothesis, to be outlined in chapter 9. Assuming 
that the microscopic dynamics conforms to the ergodic hypothesis, it
follows as a consequence that the state of the system under consideration
is described by the microcanonical ensemble in the long run which, as 
mentioned above, is consistent with the provisions of the second law of 


thermodynamics. 


However, the ergodic hypothesis does not imply that the dynamics of the 
system described in the phase space in terms of the Liouville equation 
results in an irreversible approach to the equilibrium state as implied by 
the second law. Indeed, the Liouville dynamics is a manifestly reversible 
one and, even if one assumes that it satisfies the additional requirement
implied by the ergodic hypothesis, it still does not engender the empirically
observed irreversibility that the second law codifies: if the system is
released from a state of constrained equilibrium by lifting the additional 
constraints mentioned above, the Liouville dynamics (along with the pro- 
visions of the ergodic hypothesis) is not consistent with the spontaneous 


approach to the equilibrium state, as observed empirically. 


In this context, we will have a look at the more restricted requirement 
of mixing in the next chapter. The idea of mixing was introduced on 
an intuitive basis by Gibbs. Assuming that the microscopic dynamics, 
described by the Liouville equation in the phase space, satisfies the addi- 
tional condition of mixing, something akin to the spontaneous approach 


to equilibrium can be established. 


This, however, still falls short of answering the fundamental question 


of statistical mechanics posed at the beginning of the present section. 
Mixing or not, the Liouville dynamics continues to be reversible in nature,
and it is only in the sense of a coarse-grained description that the feature 
of mixing implies the approach to equilibrium observed empirically in 
natural processes. This too will be briefly explained in the next chapter 
where we will see that the coarse-graining is a fundamental requirement 
introduced in the theory by the fact that the equilibrium state is defined 
only in macroscopic terms where the detailed dynamics in the phase space 
is partially glossed over. Put differently, the equilibrium state is defined 
not directly by the microcanonical ensemble, but by averages of phase 


space functions evaluated with reference to it. 
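
The role of coarse-graining can be illustrated with a toy model. The sketch below uses the Arnold cat map on the unit square, a standard example of an area-preserving, invertible and mixing map, as an assumed stand-in for a mixing Hamiltonian flow; it is not a model of any system discussed in this book. An ensemble of points starts inside a single coarse-graining cell, and the entropy of the cell occupation probabilities is followed under iteration. The cell count, ensemble size, and function names are illustrative choices.

```python
import numpy as np

def cat_map(x, y):
    """One step of the Arnold cat map (area-preserving and invertible)."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def coarse_grained_entropy(x, y, cells=10):
    """Entropy -sum p ln p of the occupation probabilities of cells x cells boxes."""
    hist, _, _ = np.histogram2d(x, y, bins=cells, range=[[0, 1], [0, 1]])
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(1)
x = 0.05 * rng.random(20000)      # initial patch lies inside a single cell
y = 0.05 * rng.random(20000)

entropies = [coarse_grained_entropy(x, y)]
for _ in range(12):
    x, y = cat_map(x, y)
    entropies.append(coarse_grained_entropy(x, y))

print("initial:", entropies[0], " final:", entropies[-1],
      " maximum possible:", np.log(100))
```

Each point follows a reversible rule, yet the coarse-grained entropy rises from zero to nearly its maximum value $\ln 100$, since the initial patch is stretched into a thin filament that visits all the cells almost uniformly.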


In a sense, the thermodynamic description itself is in the nature of a con- 
tracted one, when compared with the microscopic description of the dynam- 
ics of a system. The former applies to a system of a large number of degrees 
of freedom, corresponding to which the detailed microscopic dynamics in- 
volves an enormous number of variables, while the thermodynamics of non- 
equilibrium processes makes reference to only a few slow variables of the 


system. 


Finally, the question remains as to whether the feature of mixing (which 
implies ergodicity) is actually conformed to by systems of interest in sta- 
tistical mechanics. Here again, an answer in the affirmative would be 
welcome while in fact it is, to all intents and purposes, in the negative. 
In other words, few systems, if any, in statistical mechanics conform to 
the requirement of mixing. While a number of model systems have been 


shown to be of the mixing type, it is not clear whether and to what extent 
these are relevant in the context of real-life systems in statistical
mechanics. In the absence of an explicit demonstration that the requirement of
mixing (or of chaotic dynamics of a certain general type) is satisfied by 
actual systems of interest in statistical mechanics (such as a fluid in a 
container), one postulates that these systems are effectively mixing (or 
effectively chaotic) in virtue of the large number of degrees of freedom 


characterizing them. 


In the next chapter we engage with the question of linking up features of 
the microscopic dynamics of a system with those of its macroscopic states, 
where the idea of a macroscopic state is based on the expectation value of 
phase space functions with reference to certain natural measures in the 
phase space. Here one distinguishes between isolated and open systems. 
In the case of an isolated system the natural measure is the Liouville 
measure in the phase space (see sec. 9.2.2 in the next chapter), while in
the case of an open system, one has to refer to a more general class of
measures known as Gibbs measures. In particular, for a class of open
systems referred to as thermostatted ones, the relevant natural measures
are of the Sinai-Ruelle-Bowen (SRB) type. Though the microscopic
dynamics for a thermostatted system differs from the one described by the
Liouville dynamics, the SRB distribution is still considered to be of great 
relevance for open systems, especially in respect of steady states far from 


equilibrium. 


The term Gibbs distribution was introduced in sec. 5.6 in the con- 


text of the thermodynamic limit as a generalization of the idea of 
the canonical ensemble, originally introduced for finite systems. The
Gibbs measure referred to in the above paragraph is an extension to 
open systems which reduces to the canonical distribution in the case 


of an isolated system. 


A few results in non-equilibrium statistical mechanics pertaining to the 
question of the emergence of macroscopic irreversibility from microscopic 


reversibility of the equations of motion of a system will be outlined in 


chapter 9. 
Chapter 9


Dynamical Chaos and Classical Statistical Mechanics: An Overview


9.1 Introduction 


Equilibrium and non-equilibrium statistical mechanics, including the de- 
scription of far-from-equilibrium situations, refer to systems differing 
widely in their microscopic dynamics, resulting from the diverse specific 
ways their constituents interact among themselves. In trying to under- 
stand and explain the macroscopic behavior of systems, statistical me- 
chanics aims at formulating a number of basic principles from which the 
observed macroscopic behavior can be arrived at. In the case of equilib- 
rium properties of systems, these basic principles relate to the equilib- 
rium ensembles of statistical mechanics. As we have seen in chapter 8, 


non-equilibrium behavior close to equilibrium states can be explained on 
the basis of the linear response theory based on fluctuations in the
equilibrium ensembles. We will, in a number of subsequent sections in the
present chapter, see whether and how far analogous principles can be 
formulated in respect of macroscopic non-equilibrium behavior far from 


equilibrium. 


One can ask whether these basic principles of statistical mechanics can, 
in turn, be related to certain common characteristics of the microscopic 
dynamics of the systems considered. In particular, it seems worthwhile 
and rewarding to explore if the feature of dynamical chaos in the micro- 
scopic dynamics of systems made up of a large number of particles can be 
identified as such a common and unifying characteristic, leading to the 
basic principles of equilibrium and non-equilibrium statistical mechanics 
mentioned above. As we will see, such exploration has led to a number 
of significant results and to deep insights into the issue of the relation 
of the principles of statistical mechanics to the microscopic dynamics of 
systems. However, no fundamental result of an abiding relevance cover- 
ing a sufficiently wide area of inquiry can be said to have been arrived 
at. The present chapter is aimed at an elementary exposition of a few of 
the basic issues involved in this foundational area that is still very much 


open. 
9.2 Microscopic dynamics and equilibrium statistical mechanics


9.2.1 Microscopic states 


In the classical description, the most complete specification of the micro- 
scopic state of a system is in terms of a point in its phase space, referred 


to as a pure state. 


Starting from a pure state (Q(0), P(0)) (see below for notation) at some 
initial time t = 0, the Hamiltonian equations of motion for a system made 
up of N number of particles (refer to (1-8), in which s = 3N) generate a 
trajectory in the phase space, describing the time evolution of the pure 
state. For an isolated system with a specified energy E, the trajectory 
of the point in the phase space representing the instantaneous state is 
confined to a (6N — 1)-dimensional surface in the 6N-dimensional phase 
space. However, a system can seldom be isolated completely from its 
surroundings and one often has to refer to a system whose energy is 
confined within a narrow range from, say, $E$ to $E + \delta E$ ($\delta E \ll E$), in which
case the trajectory of the representative point is confined to an ‘energy
shell’, which is a thin slice of the phase space lying between two energy
surfaces (see fig. 9-1; the surface corresponding to energy E is referred to
as $\Sigma_E$). The energy shell (and hence the energy surface too) is a bounded
region in the phase space for a system confined within a finite volume V 
(reason this out; this comes about in virtue of the stability criterion on the 
potential and of the fact that each of the momentum variables remains 


bounded when the energy is fixed). 
Figure 9-1: Depicting schematically a thin energy shell of thickness $\delta E$ (with the
energy lying between $E$ and $E + \delta E$) in the phase space; the former is a thin slice
of the phase space between energy surfaces corresponding to energies $E$ and
$E + \delta E$; the phase point for an isolated system, in the course of time evolution,
stays on an energy surface (with some energy $E$), which can be considered as
the limit of a shell of vanishingly small thickness $\delta E$. (Labels in the figure:
energy surface; energy shell.)

However, for a system with a very large number of particles ($N \to \infty$), it is
meaningless to confine one’s attention to microscopic states represented 
by points in the phase space since it is impossible to specify the positions 
and momenta of each individual particle with infinite precision. This fact 
forces upon us the necessity of referring to mixed states described by 
probability distributions (fig. 9-2), where the idealization of a pure state 
represented by a point is replaced with a distribution concentrated over 
a small region of the phase space around that point. The probability 
distribution may be thought of as being confined to a small patch in 
the phase space, each point in the patch moving in accordance with the 
Hamiltonian equations of motion. It is, however, not necessary to assume 


the probability density at points outside the patch to be zero, assuming 


instead that the density is vanishingly small at points away from it. 


A mixed state described by a probability distribution over possible pure
states of a system is alternatively referred to as an ensemble, where the
ensemble is imagined as a collection of copies of the system in which
each copy is in some possible pure state, occurring with some specified
probability (or, more specifically, with some specified probability density)
in the collection. As we have seen throughout the earlier chapters of this
book, the entire subject of statistical mechanics is based on the concept
of probability distributions, which provides a unified description of
microscopic and macroscopic states of a system, as also of equilibrium and
non-equilibrium configurations of it.

Figure 9-2: Depicting schematically a small patch in the phase space
representing a probability distribution concentrated over a small region in it; such a
probability distribution corresponds to a mixed state that approximates closely
a pure state represented by a point in the phase space; the latter is an
idealization, not appropriate for the description of a system with a large number of
particles, since a specification of the exact state of each particle of the system
is not within the realm of possibility; numerous points within the patch
representing the mixed state are shown, each of which moves in accordance with the
Hamiltonian equations of motion (bent lines with arrows); the density
distribution need not drop abruptly to zero outside the patch, but may be assumed to be
vanishingly small away from it; the time evolution of the probability distribution
in the phase space is described by the Liouville equation (refer back to 8-4); at
large times, the probability distribution is likely to get spread over the entire
phase space.


The terms ‘mixed state’, ‘probability distribution’, ‘probability density’, ‘phase
space distribution’, ‘density distribution’ (in the phase space), ‘phase space
density (distribution)’ and, simply, ‘distribution’ have all been used
synonymously in this book. The term ‘probability measure’ is also used in the
literature to the same effect. In the present introductory exposition, we ignore
finer mathematical issues and focus on the basic ideas involved in
explaining the physical principles and concepts of equilibrium and non-equilibrium
statistical mechanics.


9.2.2 The ergodic hypothesis and equilibrium states 


Historically, the idea of phase space distributions is imputed to Gibbs, 
though Boltzmann seems to have been well aware of it. However, he 
preferred to couch his arguments in terms of finite regions of the phase 
space. Starting from the irrelevance of a single point in the phase space 
in describing the microscopic state of the system under consideration, 
he brought up the idea of an approximate description of a microscopic 
state (a ‘microstate’ as it is referred to) in terms of small ‘phase cells’, 
where all the points within a phase cell are assumed to correspond effec- 
tively to a single microscopic state. This is equivalent to an approximate 
description of a microscopic state in terms of a normalized probability 
distribution that has a constant value within the cell and is zero outside, 
such that the total probability within the cell is normalized to unity. More 
generally, however, such a discontinuous probability density in the phase 
space may be replaced with a more well-behaved distribution that is con- 
centrated over the cell but has a vanishingly small non-zero value away 


from it. 


In this picture based on phase cells, the representative point in the 


phase space retains its notional value. Starting from an initial location 
(Q(0), P(0)) in some particular cell at an initial time t = 0, the
representative point describes a trajectory in the phase space as determined by the
Hamiltonian equations of motion, passing through other cells in succes- 
sion. In this view based on phase cells, the trajectory of the phase point 


is replaced with the sequence of traversal of the cells. 


Boltzmann had the insight to see that the principles of thermodynamics 
can be described in a consistent manner in terms of the microscopic 
states and their time evolution provided that, in the long run, all the 
phase cells in the energy shell (or on the energy surface, depending on 
the context) are traversed uniformly, i.e., with a relative frequency that 
does not vary from one cell to another, assuming that all the cells are of 
an equal natural measure (see below; for an isolated system the ‘natural 
measure’ corresponds to the microcanonical distribution). Boltzmann 
developed the subject of equilibrium statistical mechanics (which sought 
to explain the principles of equilibrium thermodynamics) on this single 
fundamental assumption, which he referred to as the ergodic hypothesis. 
Another way to state the hypothesis is to say that, in the course of a long
stretch of time $\tau \to \infty$, the time $\tau(A)$ spent by the
representative point within any region $A$ of natural measure $m(A)$ in
virtue of its recurrent visits to this region is given by


$$\frac{\tau(A)}{\tau} = \frac{m(A)}{|M|}, \tag{9-1}$$


where $|M|$ stands for the ‘volume’ of the energy shell or the energy surface
as the case may be.


The above formulation of the ergodic hypothesis is in keeping with what is
referred to as the quasi-ergodic hypothesis of Ehrenfest who gave a
systematic formulation of Boltzmann’s original ideas.
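As an illustration of relation (9-1), the following minimal sketch uses a toy system that is not taken from the text: the rotation of a circle of unit circumference by an irrational angle, which preserves length (playing the role of the natural measure here) and is known to be ergodic. The function name `visit_fraction`, the rotation angle, and the region $A = [0.2, 0.5)$ are all assumptions made for the purpose of the illustration.

```python
import math

def visit_fraction(x0, alpha, a, b, n_steps):
    """Fraction of the first n_steps iterates of x -> (x + alpha) mod 1
    that fall in the interval [a, b) of the unit circle."""
    x = x0
    hits = 0
    for _ in range(n_steps):
        if a <= x < b:
            hits += 1
        x = (x + alpha) % 1.0
    return hits / n_steps

# Assumed parameters: golden-mean rotation angle, region A = [0.2, 0.5),
# so that m(A)/|M| = 0.3 for a circle of total length |M| = 1.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0
frac_1 = visit_fraction(0.0, ALPHA, 0.2, 0.5, 200_000)
frac_2 = visit_fraction(0.37, ALPHA, 0.2, 0.5, 200_000)
print(frac_1, frac_2)  # both close to 0.3, whatever the starting point
```

In this toy case the fraction of time spent in the region approaches its relative measure for either starting point, which is the content of the hypothesis when stated in terms of recurrent visits.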


Digression: natural measure. 


From the mathematical point of view, the phase space is a manifold, which
means that it can be looked upon as a collection of small patches that are
parts of Euclidean spaces, in each of which the ideas of length, surface
areas of various dimensions, and volume can be formulated as in the case of
our familiar three-dimensional space (all these derive from the idea of a
Riemann metric and the Riemann volume element). Since all these are
additive, one can define integrals of functions defined over the phase space
for finite regions and subsets of the manifold by adding up contributions
from the various Euclidean patches. This leads to the concept of the
Lebesgue measure for the phase space as a whole. For a system of $N$
particles with position vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots,
\mathbf{r}_N$ and corresponding momentum vectors $\mathbf{p}_1,
\mathbf{p}_2, \ldots, \mathbf{p}_N$, we denote the $3N$ components of all
the position vectors collectively by the symbol $Q$, while the components of
the momentum vectors are similarly collected in $P$. The infinitesimal
volume element of a $6N$-dimensional parallelepiped is
$dQ\,dP = \prod_{i=1}^{N} d\mathbf{r}_i\, d\mathbf{p}_i$, which specifies
the Euclidean volume of the parallelepiped, and the associated Lebesgue
measure: $\mu_L(dQ\,dP) = dQ\,dP$ (also denoted by $\mu_L(dX)$, where $X$ is
used to abbreviate $Q, P$). Other, more general, measure functions can also
be introduced on subsets of the entire phase space, where a measure is
additive on the subsets. For a measure $\mu$, the measure of an
infinitesimal parallelepiped with Lebesgue measure $dX$ is $\mu(dX)$, with
respect to which the integral of a function $F(X)$ is denoted by
$\int \mu(dX)\, F(X)$. For instance, any normalized weight function (which
can be looked upon as a probability distribution) $\rho(X)$ defined over the
phase space corresponds to a measure $\mu_\rho(dX) = \rho(X)\,dX$, and the
integral of a function $F(X)$ with respect to this measure is
$\int \mu_\rho(dX)\, F(X) = \int \rho(X) F(X)\,dX$, which is nothing but the
expectation value of $F$ with respect to the probability distribution
$\rho$. Here the integral is over the entire phase space. The measure of any
subset (say, $A$) of the phase space with respect to $\mu$ is given by
$\int_A \mu(dX)$.
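The relation between a normalized density and the measure it defines can be made concrete with a small numerical sketch. The two-dimensional phase space $X = (q, p)$, the Gaussian weight function, the choice $F = q^2 + p^2$, and the subset $A = \{q > 0\}$ below are all assumed for illustration and do not come from the text; the integrals are approximated by a midpoint Riemann sum.

```python
import numpy as np

# Midpoint grid on the square [-8, 8] x [-8, 8] of an assumed toy phase space.
n, half_width = 800, 8.0
h = 2.0 * half_width / n
q = -half_width + h * (np.arange(n) + 0.5)
p = q.copy()
Q, P = np.meshgrid(q, p, indexing="ij")

rho = np.exp(-(Q**2 + P**2) / 2.0) / (2.0 * np.pi)   # assumed density rho(X)
F = Q**2 + P**2                                      # assumed function F(X)

def integrate(values):
    """Midpoint Riemann sum approximating the integral with respect to dX."""
    return float(np.sum(values) * h * h)

total = integrate(rho)                   # measure of the whole space, about 1
expectation = integrate(rho * F)         # integral of F with respect to mu_rho
measure_A = integrate(rho * (Q > 0.0))   # mu_rho(A) for the subset A = {q > 0}
```

Here `expectation` plays the role of $\int \mu_\rho(dX)\,F(X)$ and `measure_A` that of $\int_A \mu_\rho(dX)$; for this particular density they come out close to 2 and 0.5 respectively.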


The Lebesgue measure in the phase space of a Hamiltonian sys- 
tem is also referred to as the Liouville measure since it remains 
invariant under the time evolution described by the Liouville 
equation. For a system whose time evolution is described in 
terms of a set of differential equations that may differ from the 
Liouville equation, an invariant measure (i.e., one that remains 
stationary in the time evolution) may exist that describes the 
asymptotic behavior of a large set of probability distributions in 
the phase space; this is then referred to as a ‘natural measure’. 


In the case of the Liouville equation describing the time evolution of an
isolated system, the Lebesgue measure itself is the natural measure, though
an initial measure defined in terms of a probability distribution $\rho$
tends asymptotically to this natural measure only under appropriate ‘coarse
graining’. Equilibrium statistical mechanics rests on the foundation that it
is the Liouville measure in the energy shell that corresponds to the
microcanonical ensemble. More precisely, for an ideally isolated system with
energy $E$, the normalized natural measure is given by


$$\mu(d\omega) = \frac{d\omega / |\nabla H|}{\int d\omega / |\nabla H|}, \tag{9-2}$$


where $d\omega$ denotes the Lebesgue measure on the energy surface, and the
integral in the denominator is taken over the entire energy surface.


For an open system, on the other hand, the question of a nat- 
ural measure which, generally speaking, is singular in nature, 
is not a settled one. For the steady states of a certain class of 
open systems (ones that can be described in terms of appropri- 
ate thermostats), the SRB distribution (sec. 9.10) has proved to 


be of great value. 


In this book, a number of mathematical notions are made use of in a
somewhat loose manner, without detailed and precise explanation. For
mathematically precise and authentic statements on a number of topics
pertaining to the present chapter, refer to [125] and references therein.


1. It may be mentioned that, while evaluating an integral in the notional
gamma space ($\Gamma$), one has to bring in the pre-factor
$\frac{1}{N!\,h^{3N}}$ as explained in section 1.3.2.2. The pre-factor gets
canceled if the normalized measure is referred to.


2. As stated above, a probability distribution $\rho(X)$ gives rise to a
measure $\mu_\rho(dX) = \rho(X)\,dX$, while a measure may be of a singular
nature without a physically meaningful probability distribution being
associated with it.


The ergodic hypothesis can now be interpreted as implying that in the 
long run, all the phase cells become equally probable regardless of the 
initial state of the system, where the probability distribution over the 
cells is defined in terms of the relative frequencies with which these are 
visited by the representative point, and where the hypothesis implies that 


the cells are all of equal natural measure. 


Equivalently, starting from any initial probability distribution
$\rho_0(X)$ in the phase space (from now on we refer to the energy shell
rather than the energy surface where the Lebesgue measure defines the
natural measure, subject to normalization), the following situation obtains
except when $\rho_0(X)$ belongs to a special class of distributions (whose
specification we do not enter into).


At any given time $t$, the initial probability distribution $\rho_0(X)$
evolves into a distribution $\rho_t(X)$ in virtue of the Liouville equation
(refer to section 8.3.3), which determines the evolution of a mixed state,
based on the Hamiltonian equations for pure states. Then, referring to a
long stretch of time $\tau$ ($\to \infty$), the time-averaged probability
distribution is given by


$$\bar{\rho}(X) = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} \rho_t(X)\,dt. \tag{9-3}$$


Boltzmann’s ergodic hypothesis, stated in terms of the phase cells, can
then be interpreted in terms of the evolving probability distribution by
saying that, for $\tau \to \infty$, $\bar{\rho}(X)$ is uniform over the
entire energy shell, thereby specifying the microcanonical distribution for
a system with a specified energy.


It is important to note that the above statement based on the ergodic
hypothesis does not tell us that the initial distribution $\rho_0(X)$
evolves in the long run to the limiting distribution $\bar{\rho}(X)$. Such
an evolution towards a limiting distribution requires additional
assumptions not inherent in the ergodic hypothesis. Formula (9-3) only makes
a statement about the long time average of the distribution function, which
relates to the frequency of visits to the phase cells referred to by
Boltzmann, once again taken over a long stretch of time. The correspondence
between the long time average of the relative frequency of visits to the
phase cells and the time-averaged distribution function constitutes the
interpretation of the picture based on the phase cells in terms of
probability densities in the phase space. It is the long time average that
corresponds to thermodynamic equilibrium, and one can interpret the ergodic
hypothesis as implying that the thermodynamic equilibrium is described by a
uniform relative frequency for all the phase cells (relative to their
natural measure) or, equivalently, by a uniform probability distribution
over the accessible part of the phase space.
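The distinction drawn here, between the time average (9-3) becoming uniform and the distribution itself converging, can be seen in a toy example that is assumed for illustration and is not taken from the text: the rotation of a unit circle by an irrational angle, in discrete time steps. The rotation merely transports the initial density rigidly, so the density at any single time never becomes uniform, while its time average does.

```python
import numpy as np

ALPHA = (np.sqrt(5.0) - 1.0) / 2.0     # assumed irrational rotation angle
x = (np.arange(200) + 0.5) / 200       # sample points X on the unit circle

def rho_0(x):
    """Assumed initial density: uniform on [0, 0.1), zero elsewhere."""
    return np.where((x % 1.0) < 0.1, 10.0, 0.0)

def rho_t(x, t):
    """Transported density: rho_0 evaluated at the point rotated back by t steps."""
    return rho_0(x - t * ALPHA)

n_steps = 20_000
time_avg = np.zeros_like(x)
for t in range(n_steps):
    time_avg += rho_t(x, t)
time_avg /= n_steps                    # discrete analogue of (9-3)

# Largest deviation from the uniform density 1 at the sample points.
dev_avg = float(np.max(np.abs(time_avg - 1.0)))               # small
dev_final = float(np.max(np.abs(rho_t(x, n_steps) - 1.0)))    # remains large
```

The time-averaged density is uniform to within a small error, whereas the instantaneous density at the final step still takes only the values 0 and 10, in line with the remark that the ergodic hypothesis by itself says nothing about an approach to the limiting distribution.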


A probability distribution in the phase space is, however, not of direct
relevance in terms of macroscopic observations made on the system under
consideration. What the observations relate to are the values of
thermodynamic functions characterizing the system. From the point of view
of statistical mechanics, a thermodynamic function corresponds to some
function, say, $F(X) = F(Q, P)$ defined over the phase space where,
generally speaking, the function is to be of a certain type: the value of
the function at any given instant of time depends on the position and
momentum variables of a large number (of the order of $N$) of the particles
that the system is made up of, where the contributions of the various
different particles are to be of similar orders of magnitude (for instance,
the total kinetic energy of the system is of such a type). The equilibrium
value of the thermodynamic function (at times referred to as a
‘thermodynamic variable’, a ‘thermodynamic parameter’, or a ‘state
variable’) is interpreted in either of two equivalent ways.


In one of these, we refer to an initial phase space point $X_0$, at which
the function $F$ has value $F(X_0)$. From the point of view of phase cells,
this corresponds to the value of the function corresponding to the cell to
which the initial point belongs. In the course of time, the phase point $X$
describes a trajectory in the phase space, passing through a succession
of cells and, depending on the cell to which $X_t$ (the phase point at time
$t$) belongs, the value of the function at time $t$ would be $F(X_t)$.
Starting from this microscopic definition of the instantaneous value, the
equilibrium value, interpreted as the long time average of instantaneous
values, is given by


$$\bar{F} = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} F(X_t)\,dt. \tag{9-4}$$


The other interpretation of the equilibrium value of $F$ is in terms of the
probability distribution in the phase space. Starting from an initial
distribution $\rho_0(X)$, one obtains a distribution $\rho_t(X)$ in virtue
of the time evolution in accordance with the Liouville equation. The value
of the function $F$ at time $t$ would then be given by


$$\langle F \rangle_t = \int \rho_t(X) F(X)\,dX, \tag{9-5a}$$


and the equilibrium value, interpreted as the long time average, by


$$\bar{F} = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} \langle F \rangle_t\,dt, \tag{9-5b}$$


which, according to (9-5a), is given by


$$\bar{F} = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} dt \int dX\, \rho_t(X) F(X). \tag{9-5c}$$


1. From the physical point of view, the association of thermodynamic
equilibrium to long time averages is based on the observation that,
given a sufficiently long lapse of time, an isolated thermodynamic system
is found to go over to an equilibrium configuration that remains
invariant in time. Thus, if one evaluates the time-averages of
probabilities associated with various regions in the phase space then these
represent the equilibrium probabilities since any initial period during
which the system remains out of equilibrium would not matter in the
time averages. A similar reasoning tells us that long time averages
of measured values of the thermodynamic functions represent their
equilibrium values.


2. According to formula (9-5a), the measured value at any given time
instant of the thermodynamic function $F$ is represented by its
expectation value evaluated in terms of the probability distribution over
the phase space at that instant. Strictly speaking, this requires that the
fluctuations represented by the variance are small; this can indeed be
shown to be the case for a system made up of a very large number of
particles, provided the distribution is close to the equilibrium
distribution, as we see below. Since, in the long time average, the initial
non-equilibrium situation does not count, the representation in terms
of expectation values is justified.


3. The ergodic hypothesis implies that any initial microstate evolves in 
time in such a way that the representative point ‘explores’ the entire 
energy shell (or, more precisely, the energy surface) uniformly. In the 
context of the ergodic theorem to be outlined in sec. 9.2.3 below, this 
requires the assumption that the system is metrically transitive. Boltz- 
mann made this assumption implicitly, feeling intuitively that, gener- 
ally speaking, it is satisfied for a system made up of an enormously 


large number of particles. 


9.2.3 The ergodic theorem 


Boltzmann’s intuitive ideas in proposing the ergodic hypothesis were
justified to a large extent some fifty years later independently (and
almost simultaneously) by Birkhoff and von Neumann, who laid the foundation
for ergodic theory by way of proving the ergodic theorem (their versions
differed in technical details).


Pruned of the technical details, the content of the ergodic theorem can be 


outlined as follows (see, for instance, [103], [108], [141]). 


To start with, one observes that the Liouville equation specifies, for any
given time $t$, an invertible transformation of the phase space in that,
given any point $X_0$, it specifies a transformed point $X_t$ (the one to
which $X_0$ evolves in time $t$) that can be symbolically represented in
the form


$$X_t = U_t X_0, \tag{9-6}$$


where $U_t$ is referred to as the evolution operator. As mentioned
previously, $U_t$ is a measure-preserving transformation in that the
Liouville measure remains invariant under the transformation. Finally,
looking at the family of transformations $U_t$ for all possible values of
$t$, these possess the group property


$$U_{t'}\, U_t = U_{t + t'}, \tag{9-7}$$


$U_0$ being the identity element of the group (for any given $t$, $U_{-t}$
is the inverse of $U_t$).
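The properties of the evolution operator listed above can be checked explicitly in a case where the flow is known in closed form. The sketch below assumes a one-dimensional harmonic oscillator (an example chosen here, not one used in the text), for which $U_t$ acts on $(q, p)$ as a $2 \times 2$ matrix; the function name `flow_matrix` and the numerical values are hypothetical.

```python
import numpy as np

def flow_matrix(t, m=1.0, omega=2.0):
    """Matrix of U_t for H = p^2/(2m) + m omega^2 q^2 / 2, acting on (q, p)."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.array([[c, s / (m * omega)],
                     [-m * omega * s, c]])

t1, t2 = 0.7, 1.9                      # arbitrary illustrative times
X0 = np.array([0.3, -1.2])             # arbitrary initial point (q, p)

# Group property: composing two evolutions equals evolving for the summed time.
group_ok = np.allclose(flow_matrix(t1) @ flow_matrix(t2), flow_matrix(t1 + t2))
# The evolution for time -t undoes the evolution for time t.
inverse_ok = np.allclose(flow_matrix(-t1) @ flow_matrix(t1), np.eye(2))
# Unit Jacobian determinant: phase space area (Liouville measure) is preserved.
jacobian = float(np.linalg.det(flow_matrix(t1)))
X_t = flow_matrix(t1) @ X0             # the evolved point, as in (9-6)
```

For a general Hamiltonian the map $U_t$ is nonlinear and has no such matrix form, but the group property and the unit Jacobian hold in the same way.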


Two crucial assumptions on which the theorem is based are those of
metrical transitivity and finiteness of the natural measure of the energy
surface (or of the energy shell of vanishingly small thickness). The
property of metrical transitivity asserts that there is no proper subset of
the energy shell of a non-zero measure that is left invariant by the
transformation $U_t$ for any $t \neq 0$. An equivalent way of stating the
same thing is that the energy is the only non-trivial invariant of motion
under the Liouville evolution.


The theorem itself can be stated in two parts. First, there comes the
assertion that for any integrable function $F(X)$ the time average given by
the right hand side of (9-4) exists for all $X_0$ (on which the time
average apparently depends in virtue of the evolution $X_t = U_t X_0$)
except possibly for some set of measure zero. The theorem then goes on to
assert that this time average is also the phase space average of $F(X)$
with respect to the natural measure (i.e., the Liouville measure in the
phase space in the present context of an isolated system), where we assume
the measure to be normalized. In other words,


$$\bar{F} = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} F(U_t X_0)\,dt = \frac{\int_{\Gamma_E} F(X)\,dX}{\int_{\Gamma_E} dX}, \tag{9-8a}$$


i.e.,


$$\bar{F} = \langle F \rangle_m \ \text{(say)}, \tag{9-8b}$$


for all $X_0$ except possibly for some set of measure zero, where the
integrals on the right hand side are over the entire energy shell, denoted
$\Gamma_E$ (the energy range $\delta E$ does not matter much in specifying
the energy shell so long as it is small enough, when $N$ is sufficiently
large). This implies, in particular, that the time-average on the left hand
side is actually independent of $X_0$. The equality with the phase space
average then tells us that the equilibrium value of a thermodynamic
function is given by the microcanonical ensemble introduced in section
2.2.1.
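The equality of the time average and the phase space average asserted by the theorem can be verified numerically in the simplest setting. The sketch below assumes a one-dimensional harmonic oscillator, chosen here for illustration and not taken from the text; each of its energy curves is a single closed orbit, so metrical transitivity on the curve holds trivially. The phase space average is taken on the energy curve with the weight $d\omega/|\nabla H|$ of (9-2), and $F$ is taken to be the kinetic energy; the parameter values are arbitrary.

```python
import numpy as np

M, OMEGA, E = 1.0, 2.0, 3.0            # assumed mass, frequency and energy

def kinetic(q, p):
    """Phase space function F(X): kinetic energy (independent of q)."""
    return p**2 / (2.0 * M)

# Time average along the exact trajectory lying on the energy curve H = E.
A = np.sqrt(2.0 * E / (M * OMEGA**2))  # amplitude of q
B = np.sqrt(2.0 * M * E)               # amplitude of p
t = np.linspace(0.0, 500.0, 500_001)
q_t = A * np.cos(OMEGA * t + 0.4)
p_t = -B * np.sin(OMEGA * t + 0.4)
time_avg = float(np.mean(kinetic(q_t, p_t)))

# Microcanonical average over the energy curve, parametrized by an angle theta,
# with weight (arc length element) / |grad H|.
theta = (np.arange(100_000) + 0.5) * 2.0 * np.pi / 100_000
q_s, p_s = A * np.cos(theta), B * np.sin(theta)
arc = np.sqrt((A * np.sin(theta))**2 + (B * np.cos(theta))**2)
grad_h = np.sqrt((M * OMEGA**2 * q_s)**2 + (p_s / M)**2)
weight = arc / grad_h
micro_avg = float(np.sum(kinetic(q_s, p_s) * weight) / np.sum(weight))
```

Both `time_avg` and `micro_avg` come out close to $E/2$, and changing the initial phase (0.4 above) leaves the time average unchanged, which mirrors the independence of the starting point $X_0$.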


From the practical point of view, one would like to know how the phase
average on the right hand side is approached as the interval $\tau$
occurring on the left hand side is made progressively large. More
pertinently, starting from an initial probability distribution $\rho_0(X)$
in the phase space, one can work out the phase space average of $F$ with
respect to the distribution $\rho_t(X)$ resulting from $\rho_0(X)$ by the
Liouville evolution, and then inquire whether and how this evolved average
approaches the equilibrium value represented by either side in (9-8a). The
ergodic theorem answers neither of these questions. Finally, the ergodic
theorem, which justifies the microcanonical ensemble on the basis of the
requirement of metrical transitivity, does not give an operational
criterion for deciding whether the time evolution of any given system does
actually satisfy this requirement. Put differently, the question of whether
the systems of interest in statistical mechanics are ergodic remains open.


In continuation of this brief outline of the ergodic theorem, let us
refer to (9-5c), which gives an alternative interpretation of the
equilibrium value of a thermodynamic variable $F$, based on equations
(9-5a), (9-5b), and check its consistency with (9-8a). To start with, we
note that the Liouville equation can be expressed in terms of the evolution
operator $U_t$ (refer to (9-6)) as


$$\rho_t(X) = \rho_0(U_{-t} X), \tag{9-9}$$


(check this out, referring back to sec. 8.3.3). This implies that the
equilibrium value of $F$, as expressed by the right hand side of (9-5c),
can be written in the form


$$\bar{F} = \int_{\Gamma_E} dX\, F(X)\, \bar{\rho}(X), \tag{9-10}$$


where $\bar{\rho}(X)$ stands for the time-averaged probability density
given by (9-3) (check this out; make use of the time-reversibility of the
Liouville evolution). Since the probability distribution is an integrable
function, the ergodic theorem implies that the time-averaged probability
density is actually independent of $X$ for a metrically transitive system
and therefore corresponds precisely to the microcanonical ensemble:


$$\bar{\rho}(X) = \frac{1}{\int_{\Gamma_E} dX}, \tag{9-11}$$


consistent with (9-8a). It is an invariant distribution, the only possible
one under the given constraints on the system (corresponding to specified
values of $E, V, N$ in the case of a simple fluid), describing the
equilibrium configuration under these constraints.


What the ergodic theorem does is to relate the equilibrium properties of 
an ergodic system (i.e., a metrically transitive one) with time averages. 
It does not tell us whether and how the equilibrium is approached at 
large times, for which it is necessary to assume additional features of the 


microscopic dynamics of the system under consideration. 


However, the theorem allows us to work out equilibrium properties for an
arbitrarily specified initial probability distribution when the latter is
not confined to any specified energy surface (refer to [108]). The basic
idea is to look at the total probability (i.e., unity) as the sum of
contributions from all possible energy surfaces from $E \to -\infty$ to
$E \to \infty$ though, in reality, the energy is bounded below, and the
contributions from energies in the range from $-\infty$ to the lower bound
add up to zero.


Thus, starting from an initial probability distribution $\rho_0(X)$ not
necessarily confined to an energy surface, one obtains a distribution
$\rho_t(X)$ in virtue of the Liouville evolution, in terms of which the
time-averaged distribution $\bar{\rho}(X)$ is obtained from (9-3) where,
now, $\bar{\rho}(X)$ has contributions from all the energy surfaces, and
varies from one surface to another since the time evolution is not
metrically transitive over the entire phase space $\Gamma$: the latter
decomposes into disjoint parts (the various energy surfaces) over each of
which metrical transitivity holds. Making use of the ergodicity over each
energy surface, one can define a (time-invariant) probability distribution
over the various possible values of the energy $E$ as
$$\rho(E) = \int_{\Gamma} \bar{\rho}(X)\, \delta(E - H(X))\,dX = \int_{\Gamma} \rho_0(X)\, \delta(E - H(X))\,dX, \tag{9-12}$$


(reason this out; recall that $\bar{\rho}(X)$ is now distributed over the
various different energy surfaces).


Given a phase space function $F$, its equilibrium value


$$\langle F \rangle_{\mathrm{eq}} = \int_{\Gamma} F(X)\, \bar{\rho}(X)\,dX \tag{9-13a}$$


(where we repeat that the time-averaged distribution $\bar{\rho}(X)$ on the
right hand side is not confined to an energy shell in the present context)
can be expressed in terms of the microcanonical equilibrium values
$\langle F \rangle_E$ given by (9-8a) (the sub-index $E$ is attached to
indicate that a specific energy surface is being referred to out of a
family of such surfaces), integrated over the probability distribution
$\rho(E)$ of (9-12):


$$\langle F \rangle_{\mathrm{eq}} = \int \langle F \rangle_E\, \rho(E)\,dE. \tag{9-13b}$$


This tells us that in the case of an ergodic system, the equilibrium
ensemble $\bar{\rho}$ corresponding to an arbitrarily specified initial
probability distribution (i.e., one not necessarily confined to an energy
surface) can be looked upon as a mixture of microcanonical ensembles of
various possible energy surfaces, in which the probability density at
energy $E$ is $\rho(E)$.
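The decomposition of the equilibrium ensemble into a mixture of microcanonical ensembles, as in (9-13a) and (9-13b), can be checked in a toy setting. The sketch below again assumes a one-dimensional harmonic oscillator (not an example from the text), for which the microcanonical value of the kinetic energy on the energy curve $E$ is $E/2$; the Gaussian initial distribution, the sample sizes, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M, OMEGA = 1.0, 2.0                    # assumed oscillator parameters

# Assumed initial distribution rho_0: a Gaussian cloud of phase points that is
# spread over many energy curves.
q0 = rng.normal(1.0, 0.3, size=1_000)
p0 = rng.normal(0.0, 0.5, size=1_000)
energy = p0**2 / (2.0 * M) + 0.5 * M * OMEGA**2 * q0**2   # H(X_0), conserved

# Direct route: long time average of the ensemble mean of K = p^2/(2m),
# using the exact flow p_t = p0 cos(omega t) - m omega q0 sin(omega t).
t = np.linspace(0.0, 200.0, 2_001)[:, None]
p_t = p0 * np.cos(OMEGA * t) - M * OMEGA * q0 * np.sin(OMEGA * t)
f_eq_direct = float(np.mean(p_t**2 / (2.0 * M)))

# Mixture route: microcanonical value E/2 on each energy curve, averaged over
# the energies drawn from rho_0 (a Monte Carlo estimate of the energy integral).
f_eq_mixture = float(np.mean(energy / 2.0))
```

The two estimates agree to within the finite averaging time, which illustrates how the distribution over energies is fixed once and for all by the initial distribution while each energy curve contributes its own microcanonical value.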


9.2.4 Macroscopic states: approach to equilibrium 


The basic idea in statistical mechanics is to explain the properties of 
equilibrium and non-equilibrium states of a macroscopic system in terms 
of its microscopic dynamics, i.e., the time-evolution of its microscopic 
states. The latter, ideally, are pure states of the system though, in reality, 


these may be assumed to correspond to small cells in the phase space. 


However, as mentioned earlier, the equilibrium and non-equilibrium states
are macroscopic ones, defined operationally in terms of values of a number
of phase space functions of special nature. When one speaks of the state of
a system evolving towards equilibrium, it is actually the evolution of a
macroscopic non-equilibrium state that is being spoken of. A distribution
function in itself does not distinguish between microscopic and macroscopic
states (or between ‘microstates’ and ‘macrostates’ as these are referred
to). The former correspond to pure states represented by points in the
phase space or, in practice, by phase space distributions concentrated over
small regions (the phase cells) in the phase space. The latter, on the
other hand, are described in terms of averages with respect to phase space
distributions that may, in general, be time dependent. As outlined in sec.
9.2.3 (see also 8.4.2.1), an initial probability distribution $\rho_0(X)$
evolves to a distribution, say, $\rho_t(X)$ at time $t$ in accordance with
the Liouville equation. In the following we will assume that the initial
phase space distribution $\rho_0(X)$ is concentrated over a microscopic
region of the phase space.


For the sake of clarity, we consider phase space functions grouped into
three types. A phase space function $F$ of the first type is assumed to
depend on only a few of the $6N$ phase space co-ordinates (i.e., components
of the position vectors and momenta of the $N$ particles constituting the
system under consideration; we denote these phase space co-ordinates
collectively by $X$); we refer to it as a microscopic phase space function.
A function of the second, or intermediate, type is assumed to depend on a
substantial number of the phase space co-ordinates, but that number is
still small compared to $N$. Finally, the third type of functions, referred
to as thermodynamic ones, depend non-trivially on a large number of phase
space co-ordinates, comparable to $N$. As in a substantial number of
earlier sections, we refer for the sake of concreteness to a one-component
fluid with given values of energy $E$, number of particles $N$, and volume
$V$ as our system of interest. These will be termed the thermodynamic
constraints on the system in the context of its time-evolution.


Corresponding to the above schematic classification of the phase space
functions $F(X)$, the time-evolution will also be described by referring to
three regimes, once again in a schematic way. The first of these lasts
for a short span of time, of the order of the typical collision time
($\tau_c$) of the molecules in the system. At the end of this regime, the
expectation values of the functions of the first of the above three types
all decay to nearly constant values. However, the fluctuations of these
functions are, in general, large, i.e., comparable to the averages
themselves.


The second, intermediate, regime in the time-evolution extends roughly
from $\tau_c$ to some time $\tau_1$ large compared to $\tau_c$ but still
small compared to the relaxation time $\tau_R$, the latter being the time
when the system attains equilibrium under the given constraints. In the
time span from $\tau_c$ to $\tau_1$, the expectation values of the phase
space functions of the intermediate type mentioned above all decay to
approximately constant values. Additionally, the fluctuations of these
functions remain small compared to the respective averages, but are still
not negligible.


The third of the three regimes will be referred to as the thermodynamic
regime and is the one considered in sections 8.2 and 8.5 (refer back to
section 8.8 too). In this regime, only the thermodynamic functions in
the phase space remain to attain time-independent values, and the evolution
is dominated by the hydrodynamic modes referred to in sec. 8.2.3. At any
time $t$ during the span of this regime (roughly, from $\tau_1$ to
$\tau_R$) the distribution function $\rho_t(X)$ is said to describe a
non-equilibrium macroscopic (or thermodynamic) state. The thermodynamic
functions correspond to densities of the conserved quantities
characterizing the system ($E$ and $N$ for a one-component non-magnetic
fluid) and may more generally include densities of order parameters
characterizing broken symmetries in phase transitions. These are space-time
dependent functions of the phase space variables and their expectation
values (with reference to $\rho_t$) satisfy partial differential equations
involving a number of kinetic coefficients that can be determined within
the framework of linear response theory in terms of integrals of
equilibrium time correlation functions of appropriate currents (the Kubo
formulae).


In the long run, over a time span of the order of $\tau_R$, the
non-equilibrium macroscopic state described by $\rho_t$ goes over to the
equilibrium state corresponding to the given constraints, when the
expectation values of the thermodynamic functions decay to time-independent
values. What is more, these functions are characterized by vanishingly
small fluctuations ($\Delta F = (\langle F^2 \rangle - \langle F
\rangle^2)^{1/2}$ for a phase space function $F$), typically
$\sim \frac{1}{\sqrt{N}}$ times their respective mean values. The
equilibrium state corresponds to the microcanonical distribution
$\bar{\rho}$ in the phase space, but this does not mean that $\rho_t(X)$
tends to $\bar{\rho}(X)$ at every point $X$ in the phase space. On the
other hand, as we saw in sec. 9.2.3, $\bar{\rho}$ equals the long time
average of $\rho_t$, and at large $t$, $\rho_t$ covers the phase space more
and more densely. In other words, the equilibrium state is described by the
microcanonical distribution function in the sense of a coarse-graining.
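The sense in which an evolving distribution approaches the uniform one only after coarse-graining can be visualized with a toy model. The sketch below assumes the Arnold cat map on the unit torus, an area-preserving map that is not discussed in the text and that has a stronger property (mixing) than ergodicity alone; the ensemble size, the number of steps, and the 10 x 10 grid of cells are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def cat_map(x, y):
    """One step of the area-preserving Arnold cat map on the unit torus."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def coarse_grained(x, y, cells=10):
    """Cell probabilities divided by the uniform value (1 means uniform)."""
    counts, _, _ = np.histogram2d(x, y, bins=cells, range=[[0, 1], [0, 1]])
    return counts / counts.sum() * cells**2

# Assumed initial ensemble: points concentrated in one small cell.
x = rng.uniform(0.0, 0.1, size=50_000)
y = rng.uniform(0.0, 0.1, size=50_000)
dev_initial = float(np.max(np.abs(coarse_grained(x, y) - 1.0)))

for _ in range(15):
    x, y = cat_map(x, y)
dev_final = float(np.max(np.abs(coarse_grained(x, y) - 1.0)))
```

The fine-grained set occupied by the ensemble keeps its original area and is drawn out into ever thinner filaments, yet the cell probabilities become nearly equal, so that `dev_final` is far smaller than `dev_initial`.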


All these considerations apply only to systems with a very large number
($N$) of particles where, strictly speaking, one has to go over to the
thermodynamic limit for the distinction between the microscopic and the
macroscopic modes of description to make sense. The schematic picture
outlined above is intended to convey an intuitive feeling of the way
the microscopic and macroscopic processes belong to clearly separated
regimes. Non-equilibrium and equilibrium thermodynamic theory applies to
the near-equilibrium regime ($\tau_1 < t < \tau_R$), and is explained in
terms of the linear response theory (refer to section 8.5). The concept
of entropy of a system is clearly defined in this regime in terms of the
principle of local equilibrium and, more importantly, the time evolution
can be characterized in terms of the entropy production which is positive
definite in conformity with the second law of thermodynamics provided
the eigenvalues of the matrix of kinetic coefficients are all positive.
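The role of a large particle number in suppressing the fluctuations of thermodynamic functions can be illustrated by a short sampling experiment. The sketch assumes, for simplicity, that the momentum components are independent Gaussian variables in units where $m = kT = 1$ (an assumption made here, not a statement from the text), and takes the total kinetic energy as the thermodynamic function; the function name and sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def relative_fluctuation(n_particles, n_samples=2_000):
    """Ratio Delta F / <F> for F = total kinetic energy of n_particles,
    estimated from n_samples independent draws of all 3N momentum components."""
    p = rng.normal(size=(n_samples, 3 * n_particles))
    kinetic = 0.5 * np.sum(p**2, axis=1)
    return float(np.std(kinetic) / np.mean(kinetic))

sizes = [10, 100, 1000]
# Multiplying by sqrt(N) should give roughly the same number for every N if
# the relative fluctuation falls off as 1/sqrt(N).
scaled = [relative_fluctuation(n) * np.sqrt(n) for n in sizes]
```

Under the stated assumption each entry of `scaled` is close to $\sqrt{2/3}$, so the relative fluctuation of this function indeed decreases as $1/\sqrt{N}$.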


Indeed, the matrix of kinetic coefficients has to be positive definite if
the second law of thermodynamics is to hold, and experimental
determinations of the kinetic coefficients bear this out. The eigenvalues
worked out on the basis of theoretical and numerical calculation schemes
also amply corroborate this, so much so that a scheme in which the positive
definiteness is not obtained as a consequence is marked as suspect. As
pointed out several times in earlier sections, the second law is
accommodated into the equilibrium ensembles of statistical mechanics in the
form of variational principles of the relevant thermodynamic potentials,
while the approach to equilibrium in the linear response theory is
described in terms of fluctuations based on one or other of these
equilibrium ensembles.
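The connection between the signs of the eigenvalues and the sign of the entropy production can be made explicit in a small numerical sketch. It assumes the standard bilinear form of linear irreversible thermodynamics, with fluxes $J_i = \sum_j L_{ij} X_j$ and entropy production $\sigma = \sum_i J_i X_i$; the two matrices and the function name `entropy_production` are hypothetical and carry no physical units.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical symmetric matrix of kinetic coefficients L_ij.
L = np.array([[2.0, 0.5],
              [0.5, 1.0]])
eigenvalues = np.linalg.eigvalsh(L)

def entropy_production(L, forces):
    """sigma = sum_i J_i X_i with the linear flux-force relation J = L X."""
    fluxes = L @ forces
    return float(forces @ fluxes)

# With all eigenvalues positive, sigma > 0 for every non-zero set of forces.
samples = [entropy_production(L, rng.normal(size=2)) for _ in range(1_000)]
all_positive = bool(np.all(eigenvalues > 0) and min(samples) > 0.0)

# A matrix with a negative eigenvalue admits forces for which sigma < 0.
L_bad = np.array([[1.0, 2.0],
                  [2.0, 1.0]])
sigma_bad = entropy_production(L_bad, np.array([1.0, -1.0]))   # equals -2
```

The second matrix shows why a calculation scheme that yields a non-positive eigenvalue is regarded with suspicion: it would permit a set of thermodynamic forces with negative entropy production.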


The ‘intermediate regime’ ($\tau_c < t < \tau_1$) has been included in the
above scheme empirically and in a vague manner, without a firm theoretical
basis. It does not fall under the purview of thermodynamic theory and
is also beyond the scope of the linear response theory. The principle of
local equilibrium does not apply to far-from-equilibrium situations, and
there is no acceptable definition of entropy of a state described by the
time-dependent probability distribution $\rho_t$. On the other hand, it
appears that the concept of a positive definite entropy production still
remains a valid one, thereby indicating that the idea of irreversibility
continues to be relevant for a system made up of an enormously large number
of particles. In particular, a number of concrete results have been
established in respect of far-from-equilibrium steady states, as we will
see below.


Having outlined the notion of a macroscopic state that can be either a
non-equilibrium or an equilibrium one, we now briefly look at the question
as to how the evolution from a non-equilibrium to an equilibrium
state in the case of an isolated system is correlated with its microscopic
dynamics. This relates to the widely discussed irreversibility problem:
how can one reconcile the reversible microscopic dynamics with the
irreversible evolution of the macroscopic state of an isolated system? Put
differently, is the Liouville dynamics describing the time evolution of an
ensemble consistent with the second law?


As we have indicated above, the ergodic theory that provides the basis of
the equilibrium ensembles of statistical mechanics does not answer this
since it establishes a connection between time averages of observables
and the equilibrium distribution, but does not describe the time-evolution
towards the latter. As we have seen, the linear response theory, which
assumes Liouvillean evolution close to an equilibrium state, explains a
great deal and provides us with an interpretation of non-equilibrium
thermodynamics, but still falls short of answering this question, where
one needs to make additional assumptions that imply, in combination
with the fluctuation-dissipation theorem, an irreversible approach towards
equilibrium. These additional assumptions, in the context of
non-equilibrium thermodynamics and of the linear response theory, relate to
the signs of the kinetic coefficients (more precisely, of the eigenvalues
of the matrix $L$ with the kinetic coefficients $L_{ij}$ as its elements;
refer to sections 8.5 and 8.4.11.3) or, equivalently, of the transport
coefficients. For instance, the linear response theory, in deriving the
hydrodynamic equations for a simple fluid, does not make any statement
regarding the signs of the transport coefficients $\kappa, \eta, \zeta$,
and one has to assume that these are positive in order to establish the
irreversible evolution to the equilibrium configuration. As mentioned
earlier, experimental determinations and theoretical computations are,
however, all in conformity with the positive definiteness of the kinetic
and transport coefficients.


It is in this background that one inquires as to what additional general
feature, if any, of microscopic dynamics ensures an irreversible approach
to equilibrium. Gibbs pointed to precisely such a feature of the
microscopic dynamics, namely mixing, which we will turn to in sec. 9.4 below.
In sec. 9.3 below we introduce the Frobenius-Perron and Koopman
operators in the context of Liouville dynamics, in terms of which the problem
of the irreversible approach to equilibrium can be conveniently posed in
mathematical terms. We then explain the idea of mixing in sec. 9.4, where
we point out that mixing provides an answer to the irreversibility question
in a limited and partial sense: the Liouvillean evolution of a probability
distribution for a system endowed with the property of mixing ensures
its approach to the microcanonical ensemble in the asymptotic regime
($t \to \infty$), and that too only in the so-called weak sense.


On the other hand, the question as to whether the systems of interest
in statistical mechanics are actually endowed with the mixing property
remains one with no satisfactory answer.


9.3 The Frobenius-Perron and Koopman operators


Recall that a pure state evolves in accordance with the Hamiltonian
equations of motion while a mixed state, described by a probability density in
the phase space, evolves by the Liouville equation (8-91a). On integration
of the Hamiltonian equations, the evolution of a phase point $X$
representing the pure state is formally expressed in terms of the operator $U_t$,
as in (9-6). The integral form of the Liouville equation can likewise be
expressed formally in terms of the Frobenius-Perron operator $\mathcal{F}_t$:

$\rho_t = \mathcal{F}_t \rho_0, \qquad \mathcal{F}_t = e^{Lt},$ (9-14)

where $L = -i\mathcal{L}$ is referred to as the Liouville operator, introduced in
sec. 8.4.2.1. Finally, the time evolution of an observable quantity, such
as a thermodynamic variable, defined by the phase space function $F(X)$
is described in terms of what is referred to as the Koopman operator $K_t$,
which turns out to be the Hermitian adjoint of the Frobenius-Perron
operator (eq. (9-18) below).


1. All these definitions need some clarification, for which we recall the
setting in sec. 8.4.2.1 along with the results stated there, and express
those in the present notation. The Liouville evolution of $\rho(X)$, where
the latter provides the general description of the state of a system,
is of the Schrödinger type (in analogy with the Schrödinger picture
of quantum mechanics) while the evolution of an observable quantity
described by a phase space function $F(X)$ (such phase space
functions have been considered in sec. 9.2.4 in order to define
macroscopic states of a system) is of the Heisenberg type. The
Frobenius-Perron operator and the Koopman operator both act on phase-space
functions that can be looked upon as vectors in a Hilbert space, as
indicated in section 8.4.2.1, in which the scalar product of two
functions $\phi(X), \psi(X)$ is given by (refer to eq. (8-92), where the notation is
slightly different)

$\langle \phi | \psi \rangle = \int dX\, \phi(X)^{*} \psi(X).$ (9-15)


Here the complex conjugation is used formally for the sake of gener- 
ality, and will not be needed since we will consider only real-valued 


functions. 


Strictly speaking, the vector space of probability distributions such as
$\rho(X)$ is to be distinguished from that of thermodynamic functions such
as $F(X)$, one being the dual of the other. The inner product of $\rho(X)$
and $F(X)$ gives the expectation value of $F(X)$ for the state $\rho(X)$:

$\langle F | \rho \rangle = \int dX\, F(X) \rho(X) = \langle F \rangle_{\rho},$ (9-16)

where, at times, the sub-index $\rho$ on the right hand side is suppressed.


With a given probability distribution $\rho_0(X)$ at the initial time $t = 0$, one
obtains, in the Schrödinger type description, the distribution $\rho_t$ at time
$t$ as in (9-14). In this description, the observable $F(X)$ does not evolve
in time, and one obtains the expectation value of $F$ at time $t$ as (refer
back to formula (9-5a))

$\langle F \rangle_t = \int dX\, F(X) \rho_t(X) = \int dX\, F(X) \mathcal{F}_t \rho_0(X) = \int dX\, F(X) e^{-i\mathcal{L}t} \rho_0(X).$ (9-17a)


On the other hand, in the Heisenberg type description, the probability
distribution does not evolve and continues to be given by $\rho_0$, while the
observable $F$ evolves as in (8-96), which is expressed in the present
notation as

$F(X, t) = F(U_t X, 0) = K_t F(X),$ (9-17b)

where $K_t$ is referred to as the Koopman operator. One thereby obtains

$\langle F \rangle_t = \int dX\, K_t F(X)\, \rho_0(X).$ (9-17c)


Comparing (9-17a) and (9-17c), and recalling that, as mentioned in
sec. 8.4.2.1, $\mathcal{L}$ is a Hermitian operator in the Hilbert space of functions
(subject to appropriate integrability and boundary conditions) on the
phase space, one arrives at the conclusion

$K_t = e^{i\mathcal{L}t},$ (9-18)

i.e., the Koopman operator $K_t$ is nothing but the adjoint of the
Frobenius-Perron operator $\mathcal{F}_t$. This, essentially, is the result stated in (8-99).


2. von Neumann and Koopman, along with others, set out to formulate 
the classical and the quantum mechanical theories along formally sim- 
ilar lines, which is why they adopted the representation of the states 
of a classical system by means of the probability densities in the phase 
space, which were complemented with the observable quantities of the 
system, represented by phase space functions. This was analogous 
to the description of the states of a quantum mechanical system in 
terms of the density operators, complemented with the set of opera- 
tors representing the observable quantities (the observables form an 
algebra that could be taken as the basic entity describing an infinitely 
large system; this approach, however, lies outside the scope of the 
present book). Once the classical and quantum mechanical theories 
were formulated in analogous terms, von Neumann set out to place quantum
statistical mechanics on a rigorous foundation by formulating the
quantum mechanical analog of the classical ergodic theorem estab- 
lished by him and, independently, by Birkhoff. The quantum ergodic 


theorem will be referred to in chapter 10. 
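The adjoint relation between the two operators, i.e., the equality of the expectation values computed in the Schrödinger type description (9-17a) and the Heisenberg type description (9-17c), can be checked numerically in a toy setting. The sketch below is only an illustration under stated assumptions: the phase space is replaced by a hypothetical N x N lattice, the time evolution by the Arnold cat map on that lattice (an invertible map that merely permutes the lattice points), and the names `cat_map`, `koopman`, and `frobenius_perron` are introduced here for illustration and are not taken from the text.

```python
import numpy as np

N = 7  # hypothetical lattice size; the toy phase space is the N x N discrete torus


def cat_map(x, y):
    """Arnold cat map on the lattice: an invertible map that permutes lattice points."""
    return (2 * x + y) % N, (x + y) % N


# Coordinates of every phase point X = (x, y) and of its image U X.
x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
xn, yn = cat_map(x, y)


def koopman(F):
    """Heisenberg type evolution: (K F)(X) = F(U X)."""
    return F[xn, yn]


def frobenius_perron(rho):
    """Schroedinger type evolution: the density at X is carried to the point U X."""
    out = np.zeros_like(rho)
    out[xn, yn] = rho
    return out


rng = np.random.default_rng(0)
rho0 = rng.random((N, N))
rho0 /= rho0.sum()                  # normalized initial distribution
F = rng.standard_normal((N, N))     # arbitrary real-valued observable

rho_t, F_t = rho0, F
for _ in range(5):                  # five discrete time steps
    rho_t = frobenius_perron(rho_t)
    F_t = koopman(F_t)

schrodinger = np.sum(F * rho_t)     # observable fixed, distribution evolved
heisenberg = np.sum(F_t * rho0)     # distribution fixed, observable evolved
print(schrodinger, heisenberg)
```

The two printed numbers agree to rounding error, and the evolved distribution remains normalized, which is the discrete counterpart of the statement that the Koopman operator is the adjoint of the Frobenius-Perron operator.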


9.4 The approach to equilibrium: mixing 


As mentioned earlier, the ergodic hypothesis and the ergodic theorem
establish a relation between the equilibrium configuration and time
averages of the relevant phase space functions, based on the strength of the
observation that an isolated system left to itself does approach a unique
state of equilibrium after a sufficiently long time. They do not tell us why
and how the equilibrium is approached, starting from an arbitrarily
specified initial configuration, when a set of exceptional configurations is
excluded.


Given any observable quantity described by the phase space function
$F(X)$, and an initial distribution $\rho_0(X)$, we ask the question as to whether
the following limit is realized

$\lim_{t\to\infty} \langle F \rangle_t = \lim_{t\to\infty} \int \rho_t(X) F(X)\, dX = \int \bar{\rho}(X) F(X)\, dX,$ (9-19a)

i.e., whether

$\lim_{t\to\infty} \int dX\, \rho_t(X) F(X) = \int dE\, p(E) \langle F \rangle_E,$ (9-19b)

where we have made use of (9-13a), (9-13b).


The integrals here are over the entire phase space $\Gamma$, and the probability
distributions $\rho_0(X)$, $\rho_t(X)$, $\bar{\rho}(X)$ (the latter defined as the long time average of
$\rho_t(X)$) are all considered as mixtures of ensembles corresponding to energy
surfaces of various possible energies $E$, as indicated in sec. 9.2.3.


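The integral $\int dX\, \rho_t(X) F(X)$ over the phase space can be broken up as an
integral over the various possible energy surfaces as

$\int dX\, \rho_t(X) F(X) = \int dE \int \delta(E - H(X))\, \rho_t(X) F(X)\, dX = \int dE\, \Omega(E) \langle \rho_t(X) F(X) \rangle_E,$ (9-20a)

where

$\Omega(E) = \int \delta(E - H(X))\, dX,$ (9-20b)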
is the $(6N - 1)$-dimensional volume of the energy surface at energy $E$,
and is referred to as the structure factor. In (9-20a), $\langle \rho_t(X) F(X) \rangle_E$ stands
for the microcanonical average at energy $E$ of the phase space function
$\rho_t(X) F(X)$ (refer to (9-8a), (9-8b); we replace the integrals over the energy
shell by ones over the phase space, constrained by the delta function). We
now refer to the probability density in energy, $p(E)$, defined in (9-12) and
make use of the fact that the right hand side of the first equality in it is
actually independent of $t$, in virtue of which we can now re-frame the above
question by asking whether the limit on the left hand side of the following
equation exists and equals the right hand side regardless of the choice of
$\rho_0(X)$ and $F(X)$ (subject to integrability and, for $\rho_0(X)$, normalization; one
also needs to exclude a set of initial distributions of exceptional nature):

$\lim_{t\to\infty} \int dE\, \Omega(E) \langle \rho_t(X) F(X) \rangle_E = \int dE\, \Omega(E) \langle \rho_0(X) \rangle_E \langle F(X) \rangle_E.$ (9-21a)


The answer to this question is in the affirmative, i.e., the long time limit of
$\langle F \rangle_t$ does approach $\langle F \rangle_{\rm eq}$, if, for arbitrarily chosen phase space functions
$A(X)$, $B(X)$ (subject to appropriate integrability conditions), one has

$\lim_{t\to\infty} \langle A_t(X) B(X) \rangle_E = \langle A(X) \rangle_E \langle B(X) \rangle_E,$ (9-21b)

where $A_t(X) = A(U_t X)$ (refer to eq. (8-96), where the notation is slightly
different).


In other words, if the condition (9-21b) is satisfied for all observable
quantities $A, B$ for an isolated system with a specified energy $E$ then, for an
arbitrarily chosen normalized probability distribution $\rho_0(X)$ on the
energy surface (excluding a set of exceptional distributions), the expectation
value of any observable at time $t$ under the Liouville-evolved distribution
$\rho_t$ tends, as $t \to \infty$, to the microcanonical expectation value at energy $E$.


Assuming that the dynamical equations of the system are sufficiently
smooth, one can extend this to an initial distribution $\rho_0$ spanning a family of
energy surfaces in the phase space, in which case, if the condition (9-21b)
is satisfied for all $A, B$ and all $E$ compatible with the Hamiltonian, then
the expectation value $\langle F \rangle_t$ of an observable $F$ at time $t$ tends to $\langle F \rangle_{\rm eq}$
as $t \to \infty$.


The condition (9-21b) tells us that the time correlation $\langle A_t B \rangle - \langle A \rangle \langle B \rangle$
tends to zero as $t \to \infty$. The dynamics of a system satisfying this condition
is said to be of the mixing type [108], [29].
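The decay of time correlations can be seen explicitly in a simple model. The following sketch is an illustration under assumptions of our own choosing, not a statement about any physical system: it takes the baker's map on the unit square (with the uniform measure playing the role of the microcanonical one), chooses the observables $A(X) = B(X) = x$, and estimates the correlation by Monte Carlo sampling. For this particular choice the exact value is known to be $1/(12 \cdot 2^t)$, since the $x$ co-ordinate evolves by the doubling map; the function names are hypothetical.

```python
import numpy as np


def baker_map(x, y):
    """One step of the baker's map on the unit square (area preserving)."""
    x2 = 2.0 * x
    k = np.floor(x2)            # 0 for the left half, 1 for the right half
    return x2 - k, (y + k) / 2.0


def correlation(t, n_samples=400_000, seed=0):
    """Monte Carlo estimate of <A_t B> - <A_t><B> for A(X) = B(X) = x."""
    rng = np.random.default_rng(seed)
    x0, y0 = rng.random(n_samples), rng.random(n_samples)
    x, y = x0, y0
    for _ in range(t):
        x, y = baker_map(x, y)
    a_t = x                     # A(U_t X)
    b = x0                      # B(X)
    return np.mean(a_t * b) - np.mean(a_t) * np.mean(b)


corr = [correlation(t) for t in range(8)]
for t, c in enumerate(corr):
    print(t, c, 1.0 / (12 * 2**t))   # estimate next to the exact value
```

The estimates halve with every time step, down to the level of the sampling noise. Only a few steps are taken because each iteration of the doubling consumes one bit of floating-point precision of the initial point.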


Mixing is a feature of the microscopic dynamics of the system under con- 
sideration, but it points to a macroscopic feature of great relevance — the 
one that tells us that the isolated system, left to itself, does actually evolve 
towards equilibrium described by the microcanonical ensemble. Recall 
that such a conclusion could not be arrived at on the basis of the prop- 
erty of ergodicity which is also a characteristic feature of the microscopic 
dynamics with implications about the macroscopic evolution, but one 
that only relates the time averages of observables with their phase space 
averages specified by the microcanonical ensemble. It may be mentioned 
in this context that the feature of mixing of a dynamical system implies 


ergodicity, though the converse is not true ([29], chapter 5).
As we will see in later sections, features of microscopic dynamics
acquire relevance in the context of open systems as well, where the phase
space distributions characterizing macroscopic steady states - even ones
far from equilibrium - are related to a number of general features of the
microscopic dynamics, it being the case that these phase space
distributions are stationary (under the microscopic dynamics) and asymptotic (in
the sense of $t \to \infty$), i.e., correspond to natural measures in the phase
space.


However, the mixing property, which specifies the asymptotic behavior
of expectation values of observables, does not tell us that the
distribution $\rho_t$ actually evolves to $\bar{\rho}$, the microcanonical distribution (or a mixture
of microcanonical distributions with probability density $p(E)$ introduced
earlier (refer back to (9-12)), as the case may be) in a smooth manner. To
see how an initial distribution $\rho_0$ actually evolves with time (recall that
this evolution occurs in accordance with the Liouville equation and is
described by the action of the Frobenius-Perron operator) we look at
another equivalent criterion defining a mixing system, by referring to two
subsets, $A$ and $B$, of non-zero Liouville measure on the energy surface at
energy $E$ (fig. 9-3). Each point of $A$, looked at as a pure state, evolves in
time and, at time $t$, the points of the set $A$ get distributed over a set $A_t$
whose Liouville measure remains the same as that of $A$ while its shape
gets changed, with $A_t$ intersecting $B$ in a set $A_t \cap B$. For a mixing system,
the set $A_t$ becomes filamentous with the passage of time, criss-crossing
$B$ a large number of times, and the following may be taken as a
defining criterion of mixing:


$\lim_{t\to\infty} \frac{\mu(A_t \cap B)}{\mu(B)} = \frac{\mu(A)}{\mu},$ (9-22)

where $\mu(\cdot)$ stands for the Liouville measure on the energy surface ($\mu(d\omega) = \frac{d\omega}{|\nabla H|}$,
where $d\omega$ defines the elementary surface area), which need not be
normalized since the normalization gets canceled in (9-22). In the above
formula, $\mu$ (without an argument) stands for the Liouville measure of the whole energy surface.


The requirement (9-22) is said to define what is referred to as strong mixing; 
in weak mixing, this condition is required to be satisfied only in an average 


sense. 


[Figure 9-3 (schematic). Labels in the figure: "stretched and folded structure resulting from time evolution of A"; "energy surface".]

Figure 9-3: Depicting schematically how, in virtue of the property of mixing, an initial
set $A$ is distorted into a folded filamentous structure under Hamiltonian time evolution,
criss-crossing another set $B$; in the limit of infinitely large time ($t \to \infty$) the stretched and
folded structure resulting from the time evolution covers the energy surface uniformly
(on an average) but not smoothly; since the Liouville measure of the set $A$ has to remain
the same under Hamiltonian time evolution, there remain gaps between the folds,
becoming finer and finer for increasing $t$; however, the expectation value of a macroscopic
observable approaches its microcanonical average as $t \to \infty$; an initial distribution on
the energy surface tends towards the microcanonical distribution in the weak sense.


This criterion (9-22) can be easily seen to be equivalent to (9-21b) when
the observables $A, B$ are taken to be the characteristic functions of the
sets $A$ and $B$ respectively (i.e., $A(X) = 1$ if $X \in A$, and zero otherwise, and
similarly for $B(X)$).
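The set-based criterion (9-22) lends itself to a direct numerical check in a model system. The sketch below is a hedged illustration: it assumes the Arnold cat map on the unit torus as a stand-in for a mixing Hamiltonian system (the uniform measure, of total weight one, replaces the Liouville measure on the energy surface), takes $A$ and $B$ to be two disjoint rectangles chosen arbitrarily, and estimates $\mu(A_t \cap B)/\mu(B)$ by uniform sampling. The names `mixing_ratio`, `BOX_A`, and `BOX_B` are hypothetical.

```python
import numpy as np


def cat_map(x, y):
    """One step of the Arnold cat map on the unit torus (area preserving)."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0


def in_box(x, y, box):
    """Characteristic function of the rectangle box = (x_lo, x_hi, y_lo, y_hi)."""
    x_lo, x_hi, y_lo, y_hi = box
    return (x >= x_lo) & (x < x_hi) & (y >= y_lo) & (y < y_hi)


def mixing_ratio(t, box_a, box_b, n_samples=400_000, seed=1):
    """Estimate mu(A_t intersect B) / mu(B) by uniform sampling of the torus."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(n_samples), rng.random(n_samples)
    started_in_a = in_box(x, y, box_a)     # points belonging to A at time 0
    for _ in range(t):
        x, y = cat_map(x, y)
    ends_in_b = in_box(x, y, box_b)        # points lying in B at time t
    return np.sum(started_in_a & ends_in_b) / np.sum(ends_in_b)


BOX_A = (0.0, 0.3, 0.0, 0.3)   # mu(A) = 0.09
BOX_B = (0.5, 1.0, 0.2, 0.7)   # mu(B) = 0.25, disjoint from A

for t in (0, 1, 2, 5, 10):
    print(t, mixing_ratio(t, BOX_A, BOX_B))
```

At $t = 0$ the ratio vanishes because the two sets are disjoint, while after about ten steps it settles, within sampling error, at $\mu(A) = 0.09$, the fraction of the whole torus occupied by $A$.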


What this criterion effectively tells us is that any set $A$ must, in the course
of time, be drawn into an infinitely proliferating filamentous structure
and be bent many times over (in virtue of the finite extent of the energy
surface) so that, after a sufficiently long time, the fraction of any set $B$
occupied by its intersection with the evolved set must be the same as the
fraction of the whole of the energy surface occupied by the set $A$ itself. In other words, an
initial probability distribution in the energy surface, concentrated over
any given region $A$, evolves in time by a similar elongation into a bent
filamentous structure (refer to fig. 8-4, which depicts essentially the same
process) that covers the energy surface more and more uniformly.
However, since the Liouville measure of the evolving set (and with it the
normalization of the total probability) is preserved by the
Liouville equation, the energy surface is never smoothly covered and, at
every instant of time, there remain gaps between the bent parts of the
filamentous structure representing $A_t$ in fig. 9-3. In other words, while
the distribution covers the energy surface more and more densely, there
remain sharp variations in the distribution over finer and finer scales.
One expresses this by saying that an initial probability distribution tends
towards the microcanonical distribution in a weak sense, and not in a
strong one (refer to [108], chapter 1, [29], chapter 5). In a sense, the set
$A_t$ mentioned above (or, equivalently, the phase space distribution) covers
the energy surface densely at large times, analogous to the way the set of
rational numbers covers the real line.
In summary, a system characterized by the property of mixing exhibits
the following three properties relating to its time evolution: (1) an
initial probability distribution approaches asymptotically the
microcanonical distribution in a weak sense, provided a set of exceptional initial
distributions (for instance, ones concentrated on a set of periodic points) is
excluded, (2) the expectation value of an observable approaches
asymptotically the microcanonical expectation value, and (3) time correlation
functions of the form

$\langle A_t B \rangle_c = \langle A_t B \rangle - \langle A_t \rangle \langle B \rangle,$ (9-23)

tend to zero as $t \to \infty$ (refer to (9-21b); all expectation values are with
respect to the microcanonical distribution for some specified energy $E$).


Though all these properties implied by the feature of mixing appear to 
explain the behavior of isolated macroscopic systems for which the de- 
scription in terms of equilibrium and non-equilibrium thermodynamics 
applies, it is not clear that mixing by itself answers the fundamental
questions of statistical mechanics. To start with, it is not known whether actual
systems of interest in statistical mechanics satisfy the mixing criterion, i.e.,
whether the Liouvillean time evolution of an isolated macroscopic system (such as
a fluid in a container) is of the mixing type. A number of model systems
involving hard sphere type interactions have been proved to be mixing 
(and hence, ergodic too), but such proofs do not apply to ones involving 


interactions with a ‘soft’ core. 


Recall that the foundational principle for an isolated system is that its 
time evolution is governed by the Liouville equation (refer to sec. 9.3).
Given a system made up of a large number of particles with arbitrarily 
specified interactions among them (subject to appropriate boundedness 
and stability criteria, see sec. 5.2) the question that arises is essentially 
the following: can one formulate some fundamental feature of the mi- 
croscopic dynamics, applicable to all or most of such systems, that ad- 
equately explains the observed features of their macroscopic time evolu- 
tion? Here the term macroscopic is meant to refer to the thermodynamic 
mode of description, i.e., one in terms of a small number of phase space 
functions as compared to the number of the microscopic constituents. 
The question, admittedly, is not a precisely formulated one but it encom- 
passes a number of important observed features of macroscopic time evo- 
lution such as the ones relating to transport and entropy production. The 
property of mixing seems to point in the right direction in that it involves 
two features of crucial relevance in the microscopic dynamics that seem 
to be necessary for thermodynamic behavior: (1) points arbitrarily close 
to one another in the phase space get separated in the course of time as a 
result of which any small set is drawn out into one with a complex shape 
made of long filamentous structures (this is the feature of local
instability), and (2) the global feature of folding back of the filamentous structure


because of the finite extent of the energy surface. 


In the following pages of the present chapter we continue to explore the 
correlation between features of macroscopic time evolution and micro- 


scopic dynamics of systems. 
9.5 Dynamical systems and chaos


Mixing implies chaos in a dynamical system.


The theory of dynamical systems is an extremely rich one where subtle 
technicalities are of importance. It includes the theory of chaos which 
is similarly rich and technically nuanced. Beginning from the nineteen 
sixties, dynamical chaos has bloomed into a subject covering a vast area 


that keeps on growing by the day. 


The theory of chaos in dynamical systems had its beginnings in the works
of Poincaré, Hadamard, and contemporaries, with Poincaré’s work being of
especial significance. The works of Birkhoff, von Neumann, Kolmogorov, and
several others contributed towards the laying of a broader and more solid 
foundation for subsequent developments, which started snowballing from 
the appearance of a paper by Lorenz, based on numerical work. Starting 
with the works of Smale and a few other contemporaries, mathematical 
contributions started pouring in, and continued thereafter. Seminal contri- 
butions in the theory of dynamical systems and chaos came from Arnold, 
Moser, Sinai, Ruelle, Anosov, as also from a host of other mathematicians 


and physicists. 


In the following, we briefly look at a number of aspects of dynamical 
systems and chaos relevant to statistical mechanics. In this we will often 
gloss over technical issues, focusing on physical ideas involved wherever 
possible. For more precise and rigorous statements, refer to [106], [52], 


[25], and [62]. The book by Dorfman [29] is more informal in spirit and is
pertinent to our context.


9.5.1 Dynamical systems: preliminary notions 


For our purpose, a dynamical system will mean a phase space ($M$, a
manifold such as the one defined in terms of a set of canonical co-ordinates
of a Hamiltonian system) equipped with a group of transformations and
a measure (an additive function over subsets in the space), the latter
being invariant under the transformations. The points in the phase space,
which is assumed to be a smooth one, describe the possible ‘states’ (more
precisely, pure states) of the system, and the transformations describe
the time evolution of states. The group of transformations may be
defined by means of a set of differential equations (such as the Hamiltonian
equations) or in terms of a mapping ($\Phi$), the latter case being referred to as
a discrete dynamical system. The mapping is commonly assumed to be
differentiable, with a differentiable inverse (i.e., a diffeomorphism; an
important exception is the baker’s map considered below, which is not a
diffeomorphism). In the case of a dynamical system defined in terms of
a set of differential equations (also referred to as a ‘flow’), the
transformation in the phase space corresponds to evolution in any chosen finite
time (refer to (9-6), in which case (9-7) defines the group property). The
various mathematical objects arising in the theory are assumed to have
continuity and differentiability properties as required in the relevant
contexts. The variables defining any point in the phase space (of dimension,
say, $N$) are all assumed to be real.


The smoothness requirement on the manifold ($M$) means that overlap functions
are to be infinitely differentiable, which implies that the Euclidean
patches that the manifold is made up of are joined smoothly together. Put 
differently, it should be possible to smoothly and invertibly map the mani- 
fold everywhere locally to a Euclidean space, the minimum possible dimen- 


sion of which is the dimension of the manifold. 


While the phase spaces for systems of interest in statistical mechanics
have enormously large dimensions, systems considered in the theory of
dynamical chaos are of much lower dimensions, often as low as $N = 1, 2, 3$,
though many of the theorems hold for arbitrarily chosen values of $N$.
In the case of a flow, however, $N$ has to be $\geq 3$ for chaos to be
possible. The baker’s map and the Arnold cat map, to be considered below as
low-dimensional prototypes of Hamiltonian chaos, are defined for $N = 2$.
Given any initial point in the phase space, repeated application of a
mapping $\Phi$ generates an ordered set of points referred to as a trajectory. In
the case of a flow defined by a set of differential equations, the trajectory
is a continuous curve in the phase space defined by the solution of the
differential equations. The set of points constituting a trajectory (without
reference to the order in which they are traversed) is referred to as an
orbit.


In the case of a flow, while the evolution operator for any fixed time $t$
describes a map (referred to as a ‘stroboscopic map’), another useful
mapping on a surface (of dimension $N - 1$) that can be associated with the
flow is obtained from the successive intersections of trajectories of the flow with the
surface (which is now the phase space for the map; see [106], [62]), with
the trajectories crossing the surface in any specified direction (with
reference to the two sides of the surface). This is termed the Poincaré map
for the flow, in terms of which numerous characteristics of the latter can
be conveniently discussed. A periodic orbit of the flow gives rise to a fixed
point or a periodic orbit (see below) of the Poincaré map.


A fixed point $X_0$ in the phase space is one for which

mapping: $\Phi X_0 = X_0$,

flow: $U_t X_0 = X_0$ (for all $t$). (9-24)


A periodic point $X_p$ of (least) period $p$ of a map $\Phi$ is one for which $p$ is the
least positive integer such that $X_p$ is a fixed point of $\Phi^p$, the $p$th iterate of
$\Phi$; in other words, $X_p$ is not a fixed point of $\Phi^k$ for any positive integer
$k < p$. There are $p$ such points making up a periodic
orbit of period $p$. In the case of a flow, a periodic orbit of (least) period $T$
is one on which any point $X$ is a fixed point of the evolution operator $U_T$
corresponding to time $T$ but not of $U_t$ for any $0 < t < T$.
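The definition of the least period of a periodic point of a map translates directly into a short search. The sketch below is illustrative only: it assumes the Arnold cat map restricted to points with rational co-ordinates $(i/n, j/n)$, written in integer form so that the arithmetic is exact, and the helper names `least_period`, `cat_map_mod`, and `periodic_orbit` are hypothetical.

```python
def least_period(phi, x0, max_period=10_000):
    """Return the least p >= 1 with phi^p(x0) == x0, or None if none is found."""
    x = x0
    for p in range(1, max_period + 1):
        x = phi(x)
        if x == x0:
            return p
    return None


def cat_map_mod(n):
    """Arnold cat map on the rational points (i/n, j/n), stored as integers (i, j)."""
    def phi(point):
        i, j = point
        return ((2 * i + j) % n, (i + j) % n)
    return phi


def periodic_orbit(phi, x0, p):
    """The p points visited by the periodic orbit through x0."""
    orbit = [x0]
    for _ in range(p - 1):
        orbit.append(phi(orbit[-1]))
    return orbit


phi5 = cat_map_mod(5)
p = least_period(phi5, (1, 0))
print(p, periodic_orbit(phi5, (1, 0), p))   # period and the points of the orbit
print(least_period(phi5, (0, 0)))           # the origin is a fixed point
```

For $n = 5$ the point $(1/5, 0)$ turns out to have least period 10, its orbit consisting of 10 distinct points, while the origin has period 1, i.e., it is a fixed point.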


Fixed points and periodic orbits are instances of invariant sets of a dy- 
namical system. An invariant set in the phase space is a set of points 


that remain within the set under forward and backward time evolution. 


The forward time evolution consists of applications of the map $\Phi^p$ for all
integers $p > 0$, or of $U_t$ for all $t > 0$, as the case may be. The backward time
evolution is similarly defined, now with negative values of $p$ or $t$ (in the case
of a map we assume it to be invertible).
Also relevant in the context of the dynamics of a flow or a mapping is the
closed set of its non-wandering points. A non-wandering point $X$ of a flow
is one such that, given any neighborhood $V$ of $X$ and any $T > 0$, there
exists a $t > T$ for which $U_t Y$ belongs to $V$ for at least some $Y$ in $V$. This
means that every neighborhood of $X$, however small, contains points that
return to that neighborhood at arbitrarily late times. The open invariant set complementary to a
non-wandering set is referred to as a wandering set. Wandering and
non-wandering sets of mappings are analogously defined. Fixed points and
periodic orbits constitute particular instances of non-wandering sets. A
non-wandering set may also contain non-periodic orbits.


9.5.2 Linear stability and Lyapunov exponents 


9.5.2.1 Linearization around fixed points and periodic orbits: brief outline


Given a point $X$ in the phase space of dimension $N$, one can define a
tangent space $T_X$ which is a linear vector space of dimension $N$ such that,
corresponding to a small neighborhood $N_X$ of $X$ in $M$, one can define a
small neighborhood in $T_X$ made up of points that are linear
approximations to those in $N_X$.


We first consider the linear stability analysis of a fixed point $X_0$ of a flow
defined by a set of differential equations of the form

$\dot{x}_i = F_i(x_1, x_2, \cdots, x_N) \quad (i = 1, 2, \cdots, N),$ (9-25a)

where $F_i$ are specified real-valued functions of the phase space co-ordinates.
We can write this set of equations in the compact form

$\dot{X} = F(X),$ (9-25b)

where $X$ and $F$ have $N$ components each. The equation determining $X_0$,
in this compact notation, is

$F(X_0) = 0.$ (9-25c)


Considering a small deviation $\xi(t)$ (with $N$ components $\xi_i(t)$ ($i = 1, 2, \cdots, N$))
from $X_0$, substituting $X(t) = X_0 + \xi(t)$ in (9-25b), and retaining
only the linear terms in $\xi(t)$, we obtain the linear approximation to the
flow around $X_0$ as

$\dot{\xi}(t) = A \xi(t),$ (9-25d)

where the $N \times N$ stability matrix ($A$) is the Jacobian matrix of $F$ evaluated
at $X_0$:

$A = \left( \frac{\partial F}{\partial X} \right)_{X_0}.$ (9-25e)


The linear stability of $X_0$ is determined by the eigenvalues of $A$, among
which complex eigenvalues occur in complex conjugate pairs. Let there
be $N_u$ eigenvalues with positive real part, $N_s$ eigenvalues with negative
real part, and $N_c$ eigenvalues with real parts zero (where, evidently,
$N_u + N_s + N_c = N$). Then the tangent space $T_{X_0}$ at $X_0$ can be decomposed as
the direct sum of three subspaces: a stable subspace of dimension $N_s$, an
unstable subspace of dimension $N_u$, and a center subspace of dimension
$N_c$. A point in the stable subspace approaches $\xi(t) = 0$ for $t \to \infty$ under
the linear approximation (9-25d) to the flow, while points in the unstable
subspace move away from $\xi = 0$ with increasing time, and those in the
center subspace oscillate indefinitely.
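The counting of stable, unstable, and center directions can be carried out numerically for any flow whose right hand side $F$ is available as a function. The sketch below is a minimal illustration under assumptions of our own: the flow is that of a plane pendulum with unit parameters, $\dot{\theta} = p$, $\dot{p} = -\sin\theta$, chosen only as a familiar example; the Jacobian (9-25e) is approximated by central differences; and an eigenvalue is assigned to the center subspace when its real part is smaller in magnitude than a tolerance. The function names are hypothetical.

```python
import numpy as np


def pendulum(X):
    """Right hand side F(X) for a plane pendulum with unit parameters, X = (theta, p)."""
    theta, p = X
    return np.array([p, -np.sin(theta)])


def jacobian(F, X0, h=1e-6):
    """Central-difference approximation to the stability matrix A = dF/dX at X0."""
    X0 = np.asarray(X0, dtype=float)
    n = X0.size
    A = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        A[:, j] = (F(X0 + step) - F(X0 - step)) / (2 * h)
    return A


def subspace_dimensions(A, tol=1e-6):
    """Return (N_s, N_u, N_c) from the signs of the real parts of the eigenvalues of A."""
    re = np.linalg.eigvals(A).real
    n_s = int(np.sum(re < -tol))    # decaying directions
    n_u = int(np.sum(re > tol))     # growing directions
    n_c = len(re) - n_s - n_u       # neither growing nor decaying
    return n_s, n_u, n_c


print(subspace_dimensions(jacobian(pendulum, [np.pi, 0.0])))  # inverted position
print(subspace_dimensions(jacobian(pendulum, [0.0, 0.0])))    # hanging position
```

The inverted position gives one stable and one unstable direction (a saddle), while the hanging position gives a two-dimensional center subspace, corresponding to small oscillations in the linear approximation.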


Analogous statements can be made in respect of the linear stability of a
periodic orbit of the flow (9-25a). In this case the linear approximation to
the flow is of the form (9-25d) where now the elements of the matrix $A$ are
periodic in $t$, and the linear stability analysis yields a set of numbers termed
Floquet exponents. Among these there always exists a zero exponent that
determines the linear approximation to the flow along the periodic
orbit. The remaining exponents can once again be used
to define the stable, unstable, and center subspaces corresponding
respectively to decaying, growing, and oscillating solutions to the linear
approximation to the flow around the periodic orbit.


The linear stability analysis of a periodic orbit of a flow can be reduced to
that of a fixed point or a periodic orbit of its Poincaré map, with analogous
results. In the case of a map $\Phi$, the linear stability analysis of a periodic
orbit of period $p$ can be further reduced to that of a fixed point of the $p$th
iterate ($\Phi^p$) of the map. The linear approximation to a map around a fixed
point yields a matrix $A$ whose eigenvalues can be classified into three
groups: those with absolute value $< 1$ determine the stable subspace, the
ones with absolute value $> 1$ determine the unstable subspace, and those
with absolute value 1 determine the center subspace.
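For a map, the relevant quantity is the modulus of each eigenvalue of the linearization rather than the sign of its real part: a deviation along an eigen-direction is multiplied by the eigenvalue at every iteration, so that it shrinks if the modulus is less than one and grows if it exceeds one. The sketch below illustrates the classification for two matrices chosen by us as examples, namely the matrix of the Arnold cat map (which is linear, so that its linearization at the fixed point at the origin is the matrix itself) and a quarter-turn rotation; the function name and the tolerance are assumptions.

```python
import numpy as np


def classify_multipliers(A, tol=1e-9):
    """Return (N_s, N_u, N_c) from the moduli of the eigenvalues of the matrix A."""
    mod = np.abs(np.linalg.eigvals(A))
    n_s = int(np.sum(mod < 1 - tol))    # contracting directions (stable)
    n_u = int(np.sum(mod > 1 + tol))    # expanding directions (unstable)
    return n_s, n_u, len(mod) - n_s - n_u


cat = np.array([[2.0, 1.0], [1.0, 1.0]])        # linearization of the cat map
rotation = np.array([[0.0, -1.0], [1.0, 0.0]])  # quarter turn about the fixed point

print(classify_multipliers(cat))        # one stable and one unstable direction
print(classify_multipliers(rotation))   # both eigenvalues on the unit circle
print(np.prod(np.linalg.eigvals(cat)))  # product of eigenvalues = determinant = 1
```

The cat map has eigenvalues $(3 \pm \sqrt{5})/2$, one inside and one outside the unit circle, with product equal to one as required for an area-preserving map.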
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If the flow or the map under consideration depends on one or more pa- 
rameters then there may occur parameter values across which the struc- 
ture of the stable, unstable, and center subspaces gets altered, leading 
to a change in the topological features of orbits close to the fixed point or 
the periodic orbit under consideration. Such alterations are referred to 


as local bifurcations of the flow or the map. 
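The classification of the eigenvalues of a linearized map by their moduli
(below, above, or equal to one) can be illustrated with a short Python
sketch. This is only an illustration under stated assumptions: the function
name classify_map_eigenvalues, the tolerance tol, and the two example
matrices are hypothetical choices and are not taken from the text.

```python
import numpy as np

def classify_map_eigenvalues(A, tol=1e-9):
    """Count eigenvalues of the linearized map A by modulus.

    Returns (n_stable, n_unstable, n_center): moduli < 1, > 1, and = 1,
    where "= 1" is decided within the (hypothetical) tolerance tol.
    """
    moduli = np.abs(np.linalg.eigvals(np.asarray(A, dtype=float)))
    n_stable = int(np.sum(moduli < 1.0 - tol))
    n_unstable = int(np.sum(moduli > 1.0 + tol))
    n_center = len(moduli) - n_stable - n_unstable
    return n_stable, n_unstable, n_center

# Hyperbolic example: [[2, 1], [1, 1]] has eigenvalues (3 +/- sqrt(5))/2,
# one inside and one outside the unit circle.
hyperbolic = classify_map_eigenvalues([[2.0, 1.0], [1.0, 1.0]])

# A plane rotation has both eigenvalues on the unit circle.
theta = 0.3
rotation = classify_map_eigenvalues([[np.cos(theta), -np.sin(theta)],
                                     [np.sin(theta), np.cos(theta)]])
```

The three counts are the dimensions of the stable, unstable, and center
subspaces of the linearized map; a local bifurcation corresponds to a change
in these counts as a parameter is varied.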


For our present purpose, more pertinent than the local features close to 
a fixed point or a periodic orbit, are the global features of trajectories in 
the entire phase space, and this requires the idea of Lyapunov indices as 


also of stable and unstable manifolds of trajectories. 


9.5.2.2 Lyapunov exponents


The Lyapunov exponents generalize the idea of eigenvalues characterizing 


the linearized system around a fixed point or a periodic orbit. 


Referring to an orbit $X(t)$ in the phase space of a flow, passing through
some given point $X(0)$ at $t = 0$, we consider a neighboring orbit
$X(t) + \xi(t)$ passing through $X(0) + \xi(0)$ at $t = 0$, where $\xi(0)$
is assumed to be sufficiently small so that $X(t) + \xi(t)$ can be related
to $X(t)$ by a linear approximation. Recalling that $X(t), \xi(t)$ are
$N$-component objects, the linear approximation can be expressed
mathematically as (see [52], chapter 1),


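$$\xi(t) = \frac{\partial X(t)}{\partial X(0)} \cdot \xi(0), \quad \text{(i.e.,)} \quad \xi_i(t) = \sum_{j=1}^{N} \frac{\partial X_i(t)}{\partial X_j(0)}\, \xi_j(0) \quad (i = 1, 2, \ldots, N), \tag{9-26a}$$

where

$$M(t, X(0)) = \frac{\partial X(t)}{\partial X(0)} \tag{9-26b}$$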


represents the fundamental matrix of the system of differential equations
defining the flow, constituting the generalization of the matrix $e^{At}$
(with $A$ given by (9-25e) in the context of linearization around a fixed
point) and satisfies the evolution equation

$$\dot{M}(t, X(0)) = \frac{\partial F(X(t))}{\partial X(t)} \cdot M(t, X(0)). \tag{9-26c}$$


The formal solution for the fundamental matrix can be expressed as 


$$M(t, X(0)) = \mathcal{T} \exp \int_0^t \frac{\partial F(X(\tau))}{\partial X(\tau)}\, d\tau, \tag{9-26d}$$

where $\mathcal{T}$ indicates a time-ordered product. In order to obtain the
fundamental matrix, one needs the solution $X(t)$ for the trajectory in
question.


Noting that $X(0)$ can be any arbitrarily chosen point on the trajectory
under consideration, one can state in summary that a deviation from the
trajectory at the point $X$ represented by an infinitesimal tangent vector
$\xi$ grows like $e^{\lambda t}$ where the exponent $\lambda(X, \xi)$ is
given by

$$\lambda(X, \xi) = \lim_{t \to \infty} \frac{1}{t} \ln \| M(t, X) \cdot \xi \|, \tag{9-27}$$

and where $\| \cdot \|$ stands for the Euclidean norm in the tangent
space at $X$. According to a theorem of Oseledec, depending on the
tangent vector $\xi$, $\lambda(X, \xi)$ can have any value from among a set
of numbers




$\lambda^{(i)}(X)$ ($i = 1, 2, \ldots, r$; $r \le N$) referred to as the
Lyapunov exponents (at $X$), ordered as

$$\lambda^{(1)}(X) > \lambda^{(2)}(X) > \cdots > \lambda^{(r)}(X), \tag{9-28a}$$

where these have associated multiplicities $m_i(X)$ such that

$$\sum_{i=1}^{r} m_i(X) = N. \tag{9-28b}$$

There exist nested subspaces $S^{(i)}$ ($i = 1, 2, \ldots, r$) in the tangent
space at $X$ such that, for $i = 1, 2, \ldots, r-1$, $S^{(i+1)}$ is a proper
subspace of $S^{(i)}$, $S^{(1)}$ is the tangent space itself,
$m_i = \dim(S^{(i)}) - \dim(S^{(i+1)})$, and $m_r = \dim(S^{(r)})$.
Formula (9-27) gives $\lambda^{(i)}$ for $i = 1, 2, \ldots, r-1$ if $\xi$ is
chosen in the complement of $S^{(i+1)}$ in $S^{(i)}$, while $\lambda^{(r)}$
is obtained by choosing $\xi$ in $S^{(r)}$. In making these statements, the
dependence of the Lyapunov exponents on $X$ is left implied. In reality, the
Lyapunov exponents do not depend on $X$ (except for a set of measure zero)
if the invariant measure on the phase space is ergodic. In the case of an
attractor (see below), the Lyapunov exponents are again independent of $X$
(except for a set of measure zero) if the invariant measure defining the
dynamical system is the natural measure on the attractor and $X$ belongs to
its basin of attraction. In the following, we will often leave implied the
dependence of the Lyapunov exponents on $X$. Finally, it may be mentioned
that Oseledec's theorem regarding the existence of the Lyapunov exponents
holds under quite general conditions (refer to [62]).

Lyapunov exponents for an arbitrarily chosen orbit of a mapping can be
defined in an analogous manner. Let $X$ be any point belonging to the
orbit and $\xi$ denote an infinitesimal deviation from $X$, represented by a
vector in the tangent space of $X$. Then, analogous to the formula (9-27),
one can write

$$\lambda(X, \xi) = \lim_{n \to \infty} \frac{1}{n} \ln \| D\Phi^n(X) \cdot \xi \|, \tag{9-29}$$

where $D\Phi^n(X)$ stands for the derivative of the $n$th iterate of the map
evaluated at $X$, which can be written, by invoking the chain rule of
differentiation, as a product of the derivatives of $\Phi$ evaluated at the
successive iterates of $X$ under $\Phi$. The symbol $\| \cdot \|$ stands for
the Euclidean norm of a vector in the tangent space (refer to [106],
chapter 4, for further details).
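Formula (9-29), with the derivative of the $n$th iterate written as a product
of one-step derivatives along the orbit, suggests a simple numerical
procedure. The Python sketch below is a minimal illustration, not a method
taken from the text: it uses the standard QR re-orthonormalization technique
to avoid overflow in the matrix product, and the Hénon map with parameters
1.4 and 0.3 is an assumed example. The names lyapunov_spectrum, henon_step,
henon_jacobian, A_H and B_H are hypothetical.

```python
import numpy as np

def lyapunov_spectrum(step, jacobian, x0, n_iter=20000, n_skip=1000):
    """Estimate the Lyapunov exponents of a map along the orbit of x0.

    The product of one-step Jacobians in (9-29) is accumulated with
    repeated QR factorizations; the time-averaged logarithms of |R_ii|
    approximate the exponents.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_skip):              # discard the initial transient
        x = step(x)
    Q = np.eye(x.size)
    log_sums = np.zeros(x.size)
    for _ in range(n_iter):
        Q, R = np.linalg.qr(jacobian(x) @ Q)
        log_sums += np.log(np.abs(np.diag(R)))
        x = step(x)
    return log_sums / n_iter

A_H, B_H = 1.4, 0.3                      # assumed Henon parameters

def henon_step(x):
    return np.array([1.0 - A_H * x[0] ** 2 + x[1], B_H * x[0]])

def henon_jacobian(x):
    return np.array([[-2.0 * A_H * x[0], 1.0], [B_H, 0.0]])

exponents = lyapunov_spectrum(henon_step, henon_jacobian, [0.0, 0.0])
```

Since the Jacobian determinant of this map has constant modulus B_H, the two
estimated exponents should add up to ln(B_H), which gives a quick
consistency check on the computation.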


Referring back to the Lyapunov exponents of a flow, it may be mentioned
that one can alternatively express the set of the Lyapunov exponents
at $X_0$ by diagonalizing the real positive definite matrix
$M^T(t, X_0) M(t, X_0)$. If $\sigma_i(t, X_0)$ ($i = 1, 2, \ldots, N$) are
the eigenvalues corresponding to the local stretching directions (see below)
then the Lyapunov exponents are given by

$$\lambda_i(X_0) = \lim_{t \to \infty} \frac{1}{2t} \ln \sigma_i(t, X_0) \quad (i = 1, 2, \ldots, N). \tag{9-30}$$

Here the indexing of the $\lambda$'s with the sub-index $i$ differs from the
indexing in (9-28a) where the $\lambda$'s were strictly ordered; the set of
$\lambda_i$'s, on the other hand, may include degeneracies, and contains
$N$ elements.

One can, in this context, define a set of local stretching directions
corresponding to vectors $e_i(X_0)$ ($i = 1, 2, \ldots, N$) (say) in the
tangent space and associated stretching factors $\Lambda_i(t, X_0)$ (refer
to [52], chapter 1) that evolve exponentially, such that (refer back to
eq. (9-27); we replace $X_0$, which was chosen arbitrarily, with $X$)

$$\lambda_i(X) = \lambda(X, e_i(X)) = \lim_{t \to \infty} \frac{1}{t} \ln |\Lambda_i(t, X)|. \tag{9-31}$$

The correspondence between the sets $\lambda_i$ ($i = 1, 2, \ldots, N$) and
$\lambda^{(i)}$ ($i = 1, 2, \ldots, r$) is made explicit by stating that the
stretching directions $e_i(X)$ ($i = 1, 2, \ldots, N$) can be ordered in
accordance with that of the nested subspaces $S^{(k)}$
($k = 1, 2, \ldots, r$), such that the subspace $S^{(k)}$ is spanned by a
set of vectors $e_i(X)$ with $i$ belonging to a set of indices $I^{(k)}$
(say), such that the corresponding Lyapunov indices $\lambda_i$
($i \in I^{(k)}$) are all less than or equal to $\lambda^{(k)}$.


Depending on the signs of the Lyapunov exponents $\lambda_i$
($i = 1, 2, \ldots, N$), which may be positive, negative, or zero, the
tangent space at $X$ can be decomposed as the direct sum of three subspaces,

$$T_X = \mathcal{E}_u(X) \oplus \mathcal{E}_0(X) \oplus \mathcal{E}_s(X), \tag{9-32}$$

spanned by vectors $e_i$ corresponding respectively to positive, zero, and
negative Lyapunov exponents.


One of the Lyapunov exponents has to be necessarily zero in the case of a
flow, corresponding to the direction of the flow at the point under
consideration.

For the sake of completeness we define the local stretching rates
$\chi_i(X)$ ($i = 1, 2, \ldots, N$) in terms of the stretching factors
$\Lambda_i(t, X)$ as

$$\Lambda_i(t, X) = \exp \int_0^t \chi_i(U_\tau X)\, d\tau. \tag{9-33}$$

The Lyapunov exponents are given in terms of these local stretching rates
as

$$\lambda_i(X) = \lim_{t \to \infty} \frac{1}{t} \int_0^t \chi_i(U_\tau X)\, d\tau. \tag{9-34}$$

The stretching rates are truly local quantities so that a stretching rate
may be locally negative while the corresponding Lyapunov exponent is
positive. The sum of the local stretching rates corresponding to the
positive Lyapunov indices is termed the local dispersion rate,

$$u(X) = \sum_{\lambda_i > 0} \chi_i(X). \tag{9-35}$$


In simple terms, the Lyapunov exponents tell us the way surface elements
of various dimensions, imagined around any given point in the phase space,
expand or contract in the course of evolution in time. Thus, for two initial
points $X(0)$ and $X'(0)$ close together in the phase space, the distance
$\delta d(0)$ evolves as $\delta d(t) = \delta d(0)\, e^{\lambda_1 t}$,
where $\lambda_1$ is the largest Lyapunov exponent (we assume that the
direction of the tangent vector from $X(0)$ to $X'(0)$ is a generic one, and
not along one among a number of exceptional directions). Similarly, a small
surface element $\delta A(0)$ evolves generically as
$\delta A(t) = \delta A(0)\, e^{(\lambda_1 + \lambda_2) t}$, where
$\lambda_2$ is the second largest among the Lyapunov exponents. Similar
statements can be made about volume elements of successively higher
dimensions. In particular a volume element $\delta V(0)$ of dimension $N$
(which is necessarily even for a Hamiltonian system) evolves as
$\delta V(t) = \delta V(0) \exp[(\lambda_1 + \cdots + \lambda_N) t]$. In
other words, the Lyapunov indices give the rates of volume expansion (or
contraction, as the case may be) of volume elements of various dimensions
chosen around any given point $X(0)$. In almost all cases of interest in
statistical mechanics, the Lyapunov exponents are, generically speaking,
independent of the choice of $X(0)$.


In closing this section, I mention the important result that the sum of the 
Lyapunov exponents for a flow defined as in (9-25a), (9-25b), is related to 


the average rate of change of volume in the phase space as 


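$$\sum_{i=1}^{N} \lambda_i(X) = \lim_{t \to \infty} \frac{1}{t} \int_0^t \sum_{i=1}^{N} \frac{\partial F_i}{\partial X_i}(U_\tau X)\, d\tau. \tag{9-36}$$

In this expression $\sum_i \frac{\partial F_i}{\partial X_i}(U_\tau X)$
vanishes for a conservative system, i.e.,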
one for which an arbitrarily chosen small volume in the phase space re- 
mains unchanged (though the shape of the volume element may change) 
with time. In particular, the sum of the Lyapunov exponents of a Hamil- 


tonian system is necessarily zero. 
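As a quick check of (9-36), one may consider a damped linear oscillator.
This is an illustrative example that is not taken from the text; the
frequency $\omega$ and the damping constant $\gamma > 0$ are assumed symbols.
With $X = (x, p)$, every orbit approaches the fixed point at the origin, so
the exponents are the real parts of the eigenvalues of the linearization
there:

```latex
\begin{aligned}
\dot{x} &= p, \qquad \dot{p} = -\omega^2 x - \gamma p, \\
\sum_i \frac{\partial F_i}{\partial X_i}
  &= \frac{\partial p}{\partial x}
   + \frac{\partial (-\omega^2 x - \gamma p)}{\partial p} = -\gamma, \\
\lambda_{1,2} &= \operatorname{Re}\Big( -\tfrac{\gamma}{2}
   \pm \sqrt{\tfrac{\gamma^2}{4} - \omega^2} \Big),
   \qquad \lambda_1 + \lambda_2 = -\gamma .
\end{aligned}
```

The sum of the exponents equals the constant phase space divergence
$-\gamma$, and it vanishes in the Hamiltonian limit $\gamma = 0$.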


9.5.3 Stable and unstable manifolds 


We begin by defining the local stable and unstable manifolds at any
specified point $X_0$ on a trajectory, to be referred to as $\gamma$, which
is traced out by the transform $U_t X_0$ obtained by translation through
time interval $t$ for all $t$: $-\infty < t < \infty$. The local stable
manifold $W_s(X_0)$ is defined as the set of all $X$ in the phase space such
that $\| U_t X - U_t X_0 \| \to 0$ for $t \to \infty$ where $\| \cdot \|$
denotes some appropriate metric in the phase space. The local unstable
manifold $W_u(X_0)$ is similarly defined as the set of points $X$ for which
$\| U_t X - U_t X_0 \| \to 0$ for $t \to -\infty$. The local stable and
unstable manifolds for mappings are analogously defined in terms of forward
and backward iterations respectively under the mapping.

The local stable and unstable manifolds are tangent to the stable and
unstable directions $e_i(X_0)$ defined in sec. 9.5.2.2 above. The union of
all the local stable (resp. unstable) manifolds for all the points $X$ of the
trajectory $\gamma$ is referred to as the global stable (resp. unstable)
manifold of $\gamma$, which we denote by $W_s(\gamma)$ (resp.
$W_u(\gamma)$). These are invariant under the time evolution $U_t$ for all
$t$.

Considering the global stable manifold of a trajectory $\gamma$ and the
unstable manifold of a different trajectory $\gamma'$, a point $X$ belonging
to their intersection (assuming that the latter is non-empty) is termed a
heteroclinic point, while a point belonging to the intersection of the global
stable and unstable manifolds of a single trajectory $\gamma$ is termed a
homoclinic point (assuming, once again, that the said intersection is
non-empty).


9.5.4 Linear stability of Hamiltonian systems 
Statistical mechanics is mostly concerned with Hamiltonian systems, 


these being the ones we have been considering in this book. 


Systems following non-Hamiltonian dynamics will be considered below in
sec. 9.9, where the non-Hamiltonian nature is reflected in sets of
constraints operating on the systems under consideration. The constraints
represent the effect of external systems of certain special types.

The dynamics followed by a Hamiltonian system is symplectic in nature,
since the equations of motion can be written in the form

$$\dot{X} = \Sigma \cdot \frac{\partial H}{\partial X}, \tag{9-37a}$$

where $X$ stands collectively for the $N = 6\mathcal{N}$ phase space
variables made up of the components of the position and momentum vectors
of the particles in the system ($\mathcal{N}$ = the number of particles
making up the system), $H$ represents the Hamiltonian function, and $\Sigma$
denotes the fundamental symplectic matrix represented in terms of
$3\mathcal{N} \times 3\mathcal{N}$ blocks (the null matrix $0$ and the unit
matrix $I$) as

$$\Sigma = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \tag{9-37b}$$

Generally speaking, the phase space dimension $N$ is even, i.e., $N = 2s$,
where the positive integer $s$ is referred to as the number of ‘degrees of
freedom’ of the system. In the case of a system made up of $\mathcal{N}$
particles, all moving in three dimensional space, one has
$s = 3\mathcal{N}$.


As a consequence of the symplectic structure of the Hamiltonian equations,
the eigenvalues of the matrix $M^T(t, X) M(t, X)$ (refer to sec. 9.5.2.2;
$M$ stands for the fundamental matrix defined in (9-26b)) occur in pairs
of the form $\sigma, \sigma^{-1}$ which, in turn, implies that the Lyapunov
exponents occur in pairs of the form $\lambda, -\lambda$ (the pairing rule).
Thus the sum of all Lyapunov exponents is zero and phase space volumes are
preserved (refer back to (9-36)).
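The pairing of the eigenvalues of $M^T M$ for a symplectic matrix $M$ can be
checked numerically. In the Python sketch below, which is only an
illustration, the matrix M is not a fundamental matrix of any particular
flow: it is a hypothetical symplectic matrix built as a product of two shear
matrices with symmetric off-diagonal blocks B and C, for an assumed number
of degrees of freedom s = 2.

```python
import numpy as np

s = 2                                        # degrees of freedom (assumed)
Id, Z = np.eye(s), np.zeros((s, s))
Sigma = np.block([[Z, Id], [-Id, Z]])        # fundamental symplectic matrix

# Shears with symmetric blocks are symplectic, and so is their product.
B = np.array([[1.0, 0.5], [0.5, 2.0]])
C = np.array([[0.5, 0.2], [0.2, 1.0]])
M = np.block([[Id, B], [Z, Id]]) @ np.block([[Id, Z], [C, Id]])

is_symplectic = np.allclose(M.T @ Sigma @ M, Sigma)

# Eigenvalues of M^T M in ascending order; the pairing sigma, 1/sigma
# means the k-th smallest times the k-th largest equals one.
sigma = np.linalg.eigvalsh(M.T @ M)
pair_products = sigma * sigma[::-1]
log_sum = float(np.sum(np.log(sigma)))       # vanishes: volume is preserved
```

The vanishing of log_sum is the finite-time counterpart of the statement
that the Lyapunov exponents of a Hamiltonian system add up to zero.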


As we have seen, at least one of the Lyapunov exponents for a flow is
necessarily zero, corresponding to perturbations ($\delta X$) in the
direction of the flow. In the case of a Hamiltonian system there exist at
least two zero exponents. If the Hamiltonian is the only non-trivial
conserved quantity for the flow, there occur only two zero exponents, one
corresponding to the direction of the flow in the energy surface and the
other to the direction perpendicular to the surface. These two are referred
to as the neutral directions.


A Hamiltonian system is conveniently described in terms of its Poincare 
map (refer back to sec. 9.5.1; see [106], [52]). Referred to appropriate 
co-ordinates on the surface of section, the fundamental matrix of the 
Poincare map, which is of dimension (6N — 2) x (6N — 2), describes the 


evolution of small perturbations confined to the surface of section. 


With appropriately defined co-ordinates on the surface of section, the
Poincaré map (whose fundamental matrix we denote by $M$) is symplectic,
i.e., satisfies

$$M^T \Sigma M = \Sigma, \tag{9-38}$$

where the matrix $\Sigma$ is of the form (9-37b), with
$(s-1) \times (s-1)$ blocks $0, I$. As a result, the Lyapunov exponents of
the Poincaré map also occur in pairs such as $\lambda, -\lambda$.

In the case of a Hamiltonian system for which the energy is the only
non-trivial integral of motion, the Poincaré map does not have any zero
exponent.


For a Hamiltonian system with 2s-dimensional phase space, there exist, in 
general, 2s number of constants of motion. If the system is not characterized 
by invariance under symmetry groups other than time translation, then the 
only analytic integral of motion is the Hamiltonian; the other integrals of 
motion are non-analytic almost everywhere in the phase space and do not 
correspond to pairs of zero Lyapunov exponents. It is in this sense that the 


Hamiltonian is referred to as the only non-trivial integral of motion. 


9.5.5 Chaos: geometrical and dynamical features 


The geometrical (or, topological) features of dynamical systems form the
basis of an effective and convenient description and classification of these.
These features are said to refer to the ‘phase portrait’ of the system. The
method of phase portraits was made the basis of Poincaré's analysis of
dynamical systems. The geometrical features of dynamical chaos are
directly related to corresponding dynamical features.


Geometrical analysis makes heavy use of the idea of invariant sets. In
the case of a flow, an invariant set $\mathcal{I}$ is one such that for any
point $X$ in $\mathcal{I}$, $U_t X \in \mathcal{I}$ for all $t$. In the case
of a mapping $\Phi$, an invariant set $\mathcal{I}$ is analogously defined
as one such that, if $X \in \mathcal{I}$ then $\Phi^n X \in \mathcal{I}$ for
all integers $n$. Fixed points and periodic orbits (the ‘critical elements’)
are instances of invariant sets, as are all complete orbits. The stability
and recurrence properties of invariant sets are of particular interest in
the theory of dynamical systems.


Recalling the definitions in sec. 9.5.1, a point $X$ is termed a wandering
point of a flow if there is a neighborhood $V$ of $X$ such that
$U_t X \notin V$ for all $t$ satisfying $|t| > t_0$ for some positive $t_0$.
A wandering point of a mapping is analogously defined. The set of all
wandering points (the wandering set) is an open invariant set in the phase
space. The complement of this set is closed and invariant and is referred to
as the non-wandering set. While wandering points correspond to transient
states of a dynamical system, trajectories of points in the non-wandering
set correspond to recurrent and asymptotic (i.e., ‘long-term’) behavior. In
the case of a transitive system the non-wandering set coincides with the
phase space itself. Fixed points and periodic orbits constitute instances of
non-wandering orbits, though the non-wandering set may contain non-periodic
orbits as well. In the case of a dissipative system, most initial points
tend to a set of zero measure and hence most points are non-recurrent. On
the other hand, recurrence is the generic behavior in conservative systems,
as implied by the Poincaré recurrence theorem: for any point ($X$) in a
bounded invariant domain and for any chosen neighborhood ($V$) of $X$, if
the trajectory of a point ($Y$) in $V$ moves away from it under the flow or
the mapping, then, for almost every such $Y$, that trajectory necessarily
returns to $V$ after a sufficiently long interval.
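Recurrence in a conservative system can be illustrated with a very simple
measure-preserving map. The Python sketch below is an assumed example that
is not taken from the text: it uses the rotation of the circle
$x \to x + \alpha \pmod 1$ with an irrational $\alpha$, and records the
first time at which an orbit started inside a small interval comes back to
it. The function name first_return_time and all numerical values are
hypothetical.

```python
import math

def first_return_time(x0, alpha, lo, hi, max_iter=100000):
    """First n >= 1 for which the n-th iterate of x0 lies in [lo, hi).

    Returns None if no return is observed within max_iter iterations.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x = (x + alpha) % 1.0        # rotation of the circle by alpha
        if lo <= x < hi:
            return n
    return None

alpha = (math.sqrt(5.0) - 1.0) / 2.0     # an irrational rotation number
n_return = first_return_time(0.005, alpha, 0.0, 0.01)
```

The orbit leaves the interval at the first step and comes back after a
finite number of iterations; shrinking the interval lengthens the return
time but does not prevent the return.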


Interesting instances of invariant sets from the point of view of dynamical
analysis are the attracting and repelling sets. A closed invariant set $A$
is termed an attracting set for a flow if there exists a neighborhood $V$ of
$A$ such that, for all $X$ in $V$, $U_t X$ stays in $V$ for all $t > 0$ and
$U_t X \to A$ for $t \to \infty$. Evidently, the unstable manifold of each
point in $A$ lies in $A$ itself. An analogous definition applies for a
mapping. An invariant set that is not attracting is referred to as
repelling, while an anti-attracting set is defined in similar terms as an
attracting one, by replacing $t$ with $-t$. At times, the term ‘repelling’
is used in the sense of ‘anti-attracting’.


The concept of an attracting set leads to that of an attractor when it
satisfies the condition of transitivity. We have previously introduced the
idea of transitivity in sec. 9.2.3. A flow is transitive if there exists a
dense orbit in it, i.e., the closure of the orbit coincides with the whole
phase space $\mathcal{M}$. Transitivity of an invariant set $\mathcal{I}$ is
similarly defined by requiring the existence of a dense orbit in it.


If transitivity does not hold for the whole phase space or the whole of
the non-wandering set, the latter can be decomposed into smaller locally
maximal transitive invariant sets (an invariant set $\mathcal{I}$ is locally
maximal if there exists a neighborhood $V$ containing $\mathcal{I}$ such
that there does not exist a larger invariant set $\mathcal{I}'$ in $V$).
This is known as the ergodic decomposition of the non-wandering set or of
the whole phase space. We have come across such decompositions in sections
6.2.5.5 and 6.5 in the context of broken ergodicity in phase transitions.
The ergodic components so obtained can be classified into attractors,
repellers and anti-attractors. Briefly, an attractor is an attracting set
with a dense orbit in it. Repellers and anti-attractors are defined
analogously with reference to repelling sets and anti-attracting sets. The
basin of an attractor $\mathcal{I}$ is the set of all those points in the
phase space that are attracted to $\mathcal{I}$ for $t \to \infty$.
Analogous notions apply for mappings as well.

A knowledge of attractors with their basins gives a complete description
of global stability for a dissipative system since one then knows the
attracting set of each trajectory. Distinct neighboring basins of attraction
are demarcated from one another by repelling invariant sets, referred to
as basin boundaries.


9.5.5.1 Hyperbolicity 


We define the property of hyperbolicity ([52]) of an invariant set
$\mathcal{I}$ for a flow by requiring that for every $X$ in $\mathcal{I}$,
the tangent space can be decomposed, with reference to the trajectory
through $X$, into a stable subspace, an unstable subspace, and a center
subspace (spanned by a single vector along the direction of flow through
$X$), such that there exist a positive constant $\kappa$ and a positive
function $C(X, Y)$ with the following properties: (1) any perturbation along
a vector in the stable subspace decays like
$C(X, U_t X)^{-1} e^{-\kappa t}$ for $t > 0$, or faster, (2) any
perturbation in the unstable subspace grows like
$C(X, U_t X)\, e^{\kappa t}$ for $t > 0$, or faster, and (3) the angle
$\theta_{su}(U_t X)$ between the stable and unstable subspaces at $U_t X$
remains strictly non-zero, being bounded below by
$C(X, U_t X)\, \theta_{su}(X)$. In other words, the stable and unstable
subspaces are transversal at every point $X$.


The flow is said to be hyperbolic if there is a single hyperbolic invariant 
set. Hyperbolicity for a mapping is analogously defined, with the modifi- 


cation that now there is no center subspace at any point on a trajectory. 


In the case of a hyperbolic flow, all the Lyapunov exponents
$\lambda^{(i)}(X)$ (refer to (9-28a)) are strictly bounded away from zero
with the exception of the only zero exponent corresponding to the direction
of the flow (in the case of a mapping this Lyapunov exponent is
non-existent). Thus, among the $r$ distinct Lyapunov exponents ($r \le N$),
there are, say, $\nu$ distinct positive exponents ($\nu \le r - 1$) none
less than $\kappa$, and $r - \nu - 1$ distinct negative exponents none
greater than $-\kappa$. In other words, there is a clear gap between
$-\kappa$ and $\kappa$ in which there lies no Lyapunov exponent other than
the sole zero exponent (in the case of a Hamiltonian system, there exist two
zero exponents). Analogous statements can be made about a hyperbolic
mapping for which, however, the zero exponent is absent.


In contrast to a hyperbolic invariant set, a non-hyperbolic invariant set
in a flow is one for which there exist more than one zero Lyapunov
exponent or a spectrum of Lyapunov indices accumulating to zero. While
hyperbolic invariant sets persist under small perturbations of the flow
(structural stability), non-hyperbolic invariant sets are generally
unstable to structural perturbations.


9.5.5.2 Escape-times for attractors and repellers 


Of special interest in the study of chaotic dynamical systems as models of 
equilibrium states and stationary states in statistical mechanics are the 


hyperbolic attractors and repellers (refer to sec. 9.5.5). 


Considering the union of the stable (resp. unstable) manifolds of orbits
passing through all the points of a repeller (which, by definition, is an
invariant set $\mathcal{I}$), one obtains the stable (resp. unstable)
manifold of the repeller $W_s(\mathcal{I})$ (resp. $W_u(\mathcal{I})$). The
repeller itself coincides with the intersection of the stable and unstable
manifolds,

$$\text{(repeller:)} \quad \mathcal{I} = W_s(\mathcal{I}) \cap W_u(\mathcal{I}). \tag{9-39a}$$

A hyperbolic attractor, on the other hand, coincides with its unstable
manifold,

$$\text{(attractor:)} \quad \mathcal{I} = W_u(\mathcal{I}). \tag{9-39b}$$


We now introduce the idea of an escape-time (refer to [52]). For a
neighborhood $\mathcal{U}$ of a locally maximal non-wandering set
$\mathcal{I}$, one defines future and past escape times of trajectories as
follows: the future escape time $T^{(+)}_{\mathcal{U}}(X)$ for a trajectory
initiated at $X \in \mathcal{U}$ is the maximum value of $T$ ($> 0$) for
which $U_t X$ stays in $\mathcal{U}$ for all $t$ satisfying $0 < t < T$.
Similarly, the past escape time $T^{(-)}_{\mathcal{U}}(X)$ is the minimum
value of $T$ ($< 0$) for which $U_t X$ stays in $\mathcal{U}$ for all $t$
satisfying $T < t < 0$.

In the case of an attractor, if $\mathcal{U}$ is a fundamental neighborhood
(i.e., a neighborhood for which $U_t \mathcal{U} \subset \mathcal{U}$ for
all $t > 0$), then $T^{(+)}_{\mathcal{U}}(X) = \infty$ for every
$X \in \mathcal{U}$; on the other hand, $T^{(-)}_{\mathcal{U}}(X)$ is finite
for all $X$ except those on the unstable manifold, while
$T^{(-)}_{\mathcal{U}}(X) = -\infty$ for all points $X$ in the unstable
manifold (recall that the attractor itself coincides with its unstable
manifold).

For a hyperbolic repeller, on the other hand, $T^{(+)}_{\mathcal{U}}(X)$ is
finite except for points on the stable manifold, for which the future escape
time is $\infty$. Analogously, $T^{(-)}_{\mathcal{U}}(X)$ is also finite
except for points on the unstable manifold for which the past escape time is
$-\infty$. In other words,
$T^{(+)}_{\mathcal{U}}(X) + |T^{(-)}_{\mathcal{U}}(X)|$ is finite precisely
when $X$ lies in the complement (in $\mathcal{U}$) of
$W_s(\mathcal{I}) \cup W_u(\mathcal{I})$.


A repeller is characterized by the set of trapped trajectories. Considering
the neighborhood $\mathcal{U}$, and a time $T$
($0 < T < T^{(+)}_{\mathcal{U}}$), we define
$\Upsilon^{(+)}_{\mathcal{U}}(T)$ as the set of points in $\mathcal{U}$ such
that the trajectories initiated at these points stay within $\mathcal{U}$
for all $t$ from $0$ to $T$. Similarly, we define
$\Upsilon^{(-)}_{\mathcal{U}}(T)$ as the set of points in $\mathcal{U}$ such
that the trajectories initiated at these points stay within $\mathcal{U}$
for all $t$ from $-T$ to $0$ (this requires
$T^{(-)}_{\mathcal{U}} < -T < 0$). The repeller ($\mathcal{I}$) then
coincides with the intersection of these sets containing trapped
trajectories, considered in the limit $T \to \infty$,

$$\mathcal{I} = \lim_{T \to \infty} \Upsilon^{(+)}_{\mathcal{U}}(T) \cap \Upsilon^{(-)}_{\mathcal{U}}(T). \tag{9-40}$$


A simple instance of trapped trajectories is provided by the Smale horseshoe 


for which the repeller is the product of two Cantor sets (see sec. 9.5.8.3). 
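The notions of escape time and trapped trajectories can be illustrated with
a one-dimensional, discrete-time analog. The Python sketch below is an
assumed example that is not taken from the text: it uses a piecewise linear
map of slope 3 on the unit interval, for which the set of points that never
leave $[0, 1]$ is the middle-thirds Cantor set, and it considers forward
iterates only. The names tent3, future_escape_time and trapped_fraction are
hypothetical.

```python
import numpy as np

def tent3(x):
    """Piecewise linear map of slope 3; points in (1/3, 2/3) leave [0, 1]."""
    return 3.0 * x if x < 0.5 else 3.0 * (1.0 - x)

def future_escape_time(x, max_iter=60):
    """Largest n <= max_iter such that the first n iterates of x
    (x itself is assumed to lie in [0, 1]) all remain in [0, 1]."""
    n = 0
    while n < max_iter:
        x = tent3(x)
        if not (0.0 <= x <= 1.0):
            break
        n += 1
    return n

def trapped_fraction(n, n_points=20001):
    """Fraction of a uniform grid on [0, 1] still inside after n iterates."""
    x = np.linspace(0.0, 1.0, n_points)
    alive = np.ones(n_points, dtype=bool)
    for _ in range(n):
        x = np.where(x < 0.5, 3.0 * x, 3.0 * (1.0 - x))
        alive &= (x >= 0.0) & (x <= 1.0)
    return float(alive.mean())

fraction_after_5 = trapped_fraction(5)
```

For this map the trapped fraction after $n$ steps is $(2/3)^n$, so the set
of trapped points shrinks exponentially to a set of zero measure, which is
the repeller.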


9.5.5.3 Axiom-A systems and Anosov systems 


An Axiom-A system is one for which: (1) the non-wandering set is contin- 
uously hyperbolic (i.e., the stable and unstable subspaces at any point 
X vary continuously with X; at times, the requirement of continuity is 
dropped; on the other hand, smoothness is assumed for some purposes) 
and compact (as in the case of an energy shell or energy surface of finite 


measure), and (2) periodic orbits are dense in the non-wandering set.

In the case of a two dimensional manifold, the hyperbolicity requirement
implies the density of periodic points; the density requirement is, however,
necessary for higher dimensions.


If, moreover, the non-wandering set coincides with the whole phase space, 
the flow or the mapping is termed an Anosov system (more precisely, the 


entire manifold M is to be hyperbolic). 


Axiom-A attractors are relevant in the context of SRB distributions for 
open dissipative systems (sec. 9.10) in non-equilibrium statistical me- 
chanics, and for isolated conservative systems too where the entire phase 
space of the latter can be looked upon as being characterized by a num- 
ber of features of Anosov dynamics. Restricting the dynamical system 
to such an attractor, one can assume the validity of many of the results 
derived for Anosov systems including, in particular, the existence of the 


SRB measure on the attractor ([12]). 


The Axiom-A and Anosov systems possess the important property of be- 
ing structurally stable, i.e., their geometrical features remain unchanged 
under a small change of the structure of the flow or the mapping un- 
der consideration. It is important to note that structural stability does 
not conflict with dynamical instability implied by the existence of positive 
Lyapunov exponents. Structural stability, however, is compromised for a 
number of important systems of physical relevance such as Lorentz gases 


(sec. 9.9.7) under certain conditions. 


The feature of hyperbolicity along with compactness leads to chaos. In
this sense, Axiom-A systems constitute simple instances of mixing
dynamics, simple in the sense that these can be shown to possess a number
of interesting and desirable features such as the existence of the SRB
measure (subject to the requirement of transitivity) mentioned above.


In the case of systems of interest in statistical mechanics, where the dy- 
namics is determined by the interactions between an enormously large 
number of particles, it is a near-impossibility to explicitly demonstrate 
that these satisfy the Axiom-A or Anosov properties. Nevertheless, these
can be modeled as Axiom-A (or Anosov, as the case may be) systems by
assuming that the desirable features of these simple types of systems 
hold for the real-life systems of statistical mechanics as well, including 
the existence of the SRB measure, which reduces to the microcanonical 


measure in the case of an isolated Hamiltonian system. 


9.5.5.4 Markov partitions 


A Markov partition (refer to [46], [52], [29]) of a transitive invariant set
$\mathcal{I}$ is a finite cover of $\mathcal{I}$ by approximately
rectangular cells with disjoint interiors: closed regions whose boundaries
are made up of segments of stable and unstable manifolds defined in terms of
the intrinsic hyperbolic dynamics of the system under consideration. We
assume that the dynamical system is defined in terms of a diffeomorphism
$\Phi$; in the case of a flow the diffeomorphism can be taken to be a
stroboscopic map or a Poincaré map constructed from it.


The defining property of the partition is the following: considering a point
$X$ in a cell, say, $C$ with $\Phi X$ belonging to, say, cell $C'$, the part
of the stable




manifold of $X$ lying within $C$ is to be mapped under $\Phi$ into the part
of the stable manifold of $\Phi X$ lying in $C'$; and similarly, the image
of the part of the unstable manifold of $X$ contained in $C$ has to include
the part of the unstable manifold of $\Phi X$ contained in $C'$. Put
differently, the images under $\Phi$ of the stable sides of the partition
are to be contained within the union of stable sides of the cells, and the
unstable sides are to be mapped by $\Phi^{-1}$ to within the union of
unstable sides (refer to [52], [29], [46] for details), i.e.,

$$\Phi(\partial^{(s)} C) \subset \bigcup_{C'} \partial^{(s)} C', \qquad \Phi^{-1}(\partial^{(u)} C) \subset \bigcup_{C'} \partial^{(u)} C', \tag{9-41}$$

where $\partial^{(s)}, \partial^{(u)}$ denote the stable and unstable
boundaries of a cell. Thus, no new stable boundaries are created if a cell
$C$ is evolved under $\Phi$, and no new unstable boundary is created if it
is evolved under $\Phi^{-1}$. By repeated applications of $\Phi$ and
$\Phi^{-1}$, and intersections of the regions created in the process, finer
and finer Markov partitions can be constructed. The cells in such a
partition can be used to construct a coarse-grained representation of the
dynamics under $\Phi, \Phi^{-1}$ as a Markov process, and to approximate
the corresponding stationary distribution. Further, on going to the limit
of an infinitely fine Markov partition, one can obtain the Kolmogorov-Sinai
entropy (KS entropy, see sec. 9.5.5.7) for the dynamics, which constitutes
a quantitative indicator of dynamical chaos of the system.


Considering a sufficiently fine Markov partition with Markov cells $E_i$
($i = 1, 2, \ldots, L$), we define the transition matrix $M$ with elements
$M_{ij}$ ($i, j = 1, 2, \ldots, L$) such that $M_{ij} = 1$ if the interior
of $\Phi E_i$ has a non-vanishing intersection with the interior of $E_j$,
and $M_{ij} = 0$ otherwise. Then, with each point $X$ in the phase space we
can associate a bi-infinite sequence ($q$) of symbols $q_i$, where the
latter are integers belonging to the set $\{1, 2, \ldots, L\}$ under the
above labeling of the cells
($q = \ldots q_{-2}\, q_{-1} \,.\, q_0\, q_1\, q_2 \ldots$
($q_i \in \{1, 2, \ldots, L\}$, $-\infty < i < \infty$)), and where
$\Phi^k X \in E_{q_k}$ ($k = 0, \pm 1, \pm 2, \ldots$). The position of the
symbol ‘.’ in this sequence (referred to as the symbol sequence generated by
$X$) tells us that $X \in E_{q_0}$ and $\Phi^k X \in E_{q_k}$
($k = 0, \pm 1, \pm 2, \ldots$). The same sequence with the symbol ‘.’
removed then describes the itinerary of $X$ among the Markov cells under
forward and backward iterates of $\Phi$ of all orders. Assuming that the
partition is a sufficiently fine one, the correspondence between points $X$
and symbol sequences $q$ is one-to-one (excepting for points $X$ lying on
the boundary of some cell or other; such points make up a set of measure
zero). In the following, a symbol sequence will be used to denote either a
point $X$ or its itinerary (i.e., with or without the symbol ‘.’ inserted),
depending on the context.


The transformation $X \to \Phi X$ defined by the mapping $\Phi$ induces a
transformation $q \to q' = \Phi q$, where $q'$ is obtained from $q$ by shifting
the sequence one place to the left, so that $q_{k+1}$ now occupies the position
previously occupied by $q_k$ ($k = 0, \pm 1, \pm 2, \ldots$). In other words,
$q' = \ldots q_{-2}\, q_{-1}\, q_0\, .\, q_1\, q_2 \ldots$.


Referring to the transition matrix $M$ introduced above, the symbol sequence
$q$ corresponding to the point $X$ will evidently have to satisfy the
compatibility condition $M_{q_k q_{k+1}} = 1$ for all $k$, i.e., the only
allowed sequences under the dynamics defined by $\Phi$ are the ones satisfying
this compatibility condition. The transitivity of the set $\mathcal{I}$ (the
entire phase space in the case of an Anosov system) implies that for some
positive integer $K$, $M^K$ has no zero entry.


The above considerations tell us that for a hyperbolic and transitive in- 
variant set under the mapping ®, one can define a Markov partition and 
a corresponding transition matrix M such that the dynamics under ® 
can be described as a Markov process represented by a shift operation 
on symbol sequences compatible with $M$. This provides the basis for a
large number of desirable features of dynamical systems of the type under 
consideration including, in particular, Axiom-A systems (recall that an 
Anosov system constitutes a particular instance of an Axiom-A system). 
In particular, of great relevance is the feature of existence of an invariant 
measure, namely, the SRB measure. A special case of a Markov process 
represented by a shift on the space of compatible symbol sequences, is a 
Bernoulli shift for which the compatibility condition is expressed in terms 
of a transition matrix M that has no zero entry. This is the case, for 
instance, with the Baker’s map (sec. 9.5.8.1) and the Smale horseshoe 


(sec. 9.5.8.3). 
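The compatibility condition and the requirement that some power of the transition matrix have no zero entry can be checked mechanically. The following is a minimal sketch in Python, not taken from the text: the helper names (`is_admissible`, `primitivity_exponent`) and the example matrices are illustrative assumptions, and the cells are labeled from 0 rather than from 1 for convenience.

```python
def is_admissible(word, M):
    """True if every consecutive pair (a, b) of symbols satisfies M[a][b] == 1."""
    return all(M[a][b] == 1 for a, b in zip(word, word[1:]))


def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def primitivity_exponent(M, max_power=50):
    """Smallest K <= max_power such that M^K has no zero entry, else None."""
    P = M
    for K in range(1, max_power + 1):
        if all(x > 0 for row in P for x in row):
            return K
        P = mat_mul(P, M)
    return None


# Hypothetical examples (assumptions, chosen only for illustration):
M_full = [[1, 1], [1, 1]]      # no zero entry: the Bernoulli-shift case
M_golden = [[1, 1], [1, 0]]    # symbol 1 may not follow symbol 1
M_swap = [[0, 1], [1, 0]]      # strictly alternating itineraries

print(is_admissible([0, 1, 0, 0, 1], M_golden))   # an allowed finite itinerary
print(is_admissible([0, 1, 1], M_golden))         # contains the forbidden pair 1 -> 1
print(primitivity_exponent(M_full), primitivity_exponent(M_golden),
      primitivity_exponent(M_swap))
```

For `M_swap` no power is free of zeros, so the search returns `None`; such a matrix describes a purely periodic alternation between two cells rather than a transitive chaotic set.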


Generally speaking, a Markov partition provides a coarse-grained de- 
scription, in terms of a Markov shift, of the dynamics of the system under 
consideration in the phase space. However, if the Markov partition is fine 
enough, the shift dynamics on the space of compatible symbol sequences 
turns out to be a faithful representation of the phase space dynamics, in 
which case the Markov partition is termed a generating partition. Gener- 
ating partitions are mostly made up of infinite numbers of boxes, though 


ones with a finite number are also possible.

An Axiom-A system, being hyperbolic, involves stretching along the unstable
directions at every point, i.e., exponential separation of nearby trajectories
(or, put differently, a sensitive dependence on initial conditions) and, at the
same time, a folding of trajectories because of the constraint imposed by a
compact phase space (or by the compact invariant set in which the motion
remains confined). As we saw in sec. 9.4, these are the characteristic features
of mixing. In other words, Axiom-A systems provide instances of mixing
dynamics. In the case of an isolated Hamiltonian system the feature of mixing
leads to the existence of the microcanonical distribution which corresponds to
the natural measure for the dynamics (i.e., it is a stationary distribution and
almost all initial distributions tend asymptotically to this distribution in a
weak sense). In the case of a dissipative Axiom-A system, the existence of a
natural measure is once again assured on an attractor, this being the SRB
measure mentioned above (see sec. 9.10).


The feature of mixing results in chaotic dynamics on an Axiom-A attrac- 
tor, i.e., over the entire phase space in the particular case of an Anosov 
system. A quantitative measure of the prevalence of chaos in a dynamical 
system is provided by its entropy, which can be defined in one of several 
possible ways. Among these, we consider below the topological entropy 


and the Kolmogorov-Sinai (KS) entropy. 


9.5.5.5 Topological entropy 


The topological entropy ([52],[148]) is a quantitative indicator relating to
the geometrical features of an invariant set ($\mathcal{I}$) of a dynamical
system, providing a measure of the number of orbits that can be distinguished
to within a given resolution in a given time. This is made more precise
in terms of the idea of an $(\epsilon,n)$-separated set. We assume that the
phase space $\mathcal{M}$ for the dynamical system defined by a mapping $\Phi$
has a metric $d(X,Y)$ defined in it, where $X$, $Y$ denote any two points.


The various entropies characterizing a dynamical system are measures of 


the rate of increase of dynamical complexity as the system evolves in time. 


We say that a set $E \subset \mathcal{I}$ is an $(\epsilon,n)$-separated subset
if, for given $\epsilon > 0$ and a given positive integer $n$, there exists an
integer $m$ ($0 \le m < n$) for every pair of distinct points $X$, $Y$ in it
such that $d(\Phi^m X, \Phi^m Y) > \epsilon$, i.e., the orbits originating in
$X$, $Y$ can be resolved (to accuracy $\epsilon$) within $n$ time steps (every
iteration under $\Phi$ being counted as a time step).


The topological entropy $h_{top}(\Phi)$ measures the maximum rate of
proliferation of such $\epsilon$-separated orbits for arbitrarily small
$\epsilon$. If $N(\epsilon,n)$ denotes the maximum cardinality of all the
$(\epsilon,n)$-separated sets, then

$$h_{top}(\Phi) = \lim_{\epsilon\to 0}\left[\limsup_{n\to\infty}\frac{1}{n}\ln N(\epsilon,n)\right]. \tag{9-42}$$


In other words, if $h$ denotes the topological entropy then the number of
orbits that can be resolved in $n$ time steps grows with $n$ roughly as
$e^{hn}$. The topological entropy itself measures the rate of proliferation of
orbits of the system with time.
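For a shift on symbol sequences constrained by a 0/1 transition matrix, the orbits that can be resolved in $n$ steps are in correspondence with the allowed symbol words of length $n$ (up to a shift in $n$ that depends on the resolution), so the growth rate can be estimated by direct counting. The sketch below is an illustration under that assumption and is not taken from the text; the function names and the example matrix are hypothetical, and the comparison value uses the standard result that the entropy of such a shift equals the logarithm of the largest eigenvalue of the matrix.

```python
import math


def count_words(M, n):
    """Number of allowed words of length n >= 1 for the 0/1 matrix M."""
    L = len(M)
    # v[j] = number of allowed words of the current length ending in symbol j
    v = [1] * L
    for _ in range(n - 1):
        v = [sum(v[i] * M[i][j] for i in range(L)) for j in range(L)]
    return sum(v)


def entropy_estimate(M, n):
    """Finite-n estimate (1/n) ln N_n of the topological entropy."""
    return math.log(count_words(M, n)) / n


# Hypothetical example: symbol 1 may not be followed by symbol 1.
M_golden = [[1, 1], [1, 0]]
# Largest eigenvalue of M_golden is the golden ratio.
h_exact = math.log((1 + math.sqrt(5)) / 2)

for n in (5, 20, 200):
    print(n, count_words(M_golden, n), entropy_estimate(M_golden, n), h_exact)
```

The estimate approaches the exact value only as $1/n$, because the word count carries a constant prefactor whose logarithm is divided by $n$.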


The topological entropy can also be expressed in terms of $(\epsilon,n)$-spanning
sets. An $(\epsilon,n)$-spanning set $E$ is one such that, for every $X$ in
$\mathcal{I}$, there is a $Y$ in $E$ such that $d(\Phi^m X, \Phi^m Y) < \epsilon$
for all $m < n$; one then takes $N(\epsilon,n)$ to be the minimum cardinality
of such spanning sets.


The topological entropy so defined turns out to be zero if $\mathcal{I}$ has a
simple structure such as one containing only a single or finitely many periodic
orbits, while it works out to a finite positive value for a compact invariant
set when the latter has a complex structure such as one including non-periodic
orbits and infinitely many periodic orbits. One can thus define topological
chaos as a feature of the dynamics in $\mathcal{I}$ (or in the entire phase
space $\mathcal{M}$) when $h_{top} > 0$.


9.5.5.6 Periodic orbits 


The number and disposition of periodic orbits of a flow or mapping are
features of great relevance since, in a chaotic system, these form a dense
set in the phase space or in the relevant invariant set. The minimum
number of time steps required for the traversal of a periodic orbit of a
map $\Phi$ is referred to as the prime period of that orbit, with an analogous
definition in the case of a flow. A periodic orbit of prime period $T_p$ is
also a periodic orbit of period $rT_p$, where $r$ is any positive integer,
though only the value $r = 1$ corresponds to the prime period. Here the index
$p$ runs over the set ($\mathcal{P}$, say) of all prime periodic orbits.


The distribution of the periodic orbits (the spectrum of periods and their
repetitions) is described by the distribution function

$$\rho(t) = \sum_{r=1}^{\infty}\sum_{p\in\mathcal{P}} T_p\,\delta(t - rT_p), \tag{9-43a}$$

while its integral gives the cumulative distribution function

$$R(t) = \sum_{r=1}^{\infty}\sum_{p\in\mathcal{P}} T_p\,\Theta(t - rT_p), \tag{9-43b}$$

where $\Theta(t)$ stands for the Heaviside step function.


The Laplace transform of the distribution function $\rho(t)$ works out to (see [52])

$$\int_0^{\infty} e^{-st}\rho(t)\,dt = -\frac{\partial_s\zeta_{top}(s)}{\zeta_{top}(s)}, \tag{9-44a}$$

where $\zeta_{top}(s)$ stands for the topological zeta function

$$\zeta_{top}(s) = \prod_{p\in\mathcal{P}}\frac{1}{1 - e^{-sT_p}}. \tag{9-44b}$$


The analytical structure of the topological zeta function contains all the 
information about the spectrum of the periods of the periodic orbits and 
the rate of growth of the cumulative function R(t). The topological entropy 
constitutes an upper bound on the growth of the cumulative function, 
which implies that the periodic orbits proliferate at most exponentially in 


a deterministic dynamical system. 


In the case of a hyperbolic system, the periodic orbits are conveniently
described in terms of repeating symbol sequences associated with a Markov
partition. This correspondence between periodic orbits and symbol sequences
is found to imply the following exponential growth of the cumulative function

$$R(T) \sim \frac{e^{T h_{top}}}{h_{top}}. \tag{9-45}$$

This estimate also implies that the topological zeta function has a pole at
$s = h_{top}$, the topological entropy.


At the same time, the number of prime periodic orbits $p$ with period
$T_p < T$ is seen to grow as

$$\frac{e^{T h_{top}}}{T h_{top}}. \tag{9-46}$$
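For a mapping described by a 0/1 transition matrix $M$, the exponential proliferation of periodic orbits can be checked by direct counting. The sketch below is an illustration that is not part of the text: it relies on the standard facts that the number of periodic points of period $n$ of the shift equals the trace of $M^n$ and that the number of prime orbits follows from it by Möbius inversion. The helper names and the example matrix are assumptions. In discrete time the count of prime orbits of length $n$ grows as $e^{n h_{top}}/n$.

```python
import math


def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def periodic_points(M, n):
    """Number of points of (not necessarily prime) period n: trace of M^n."""
    P = M
    for _ in range(n - 1):
        P = mat_mul(P, M)
    return sum(P[i][i] for i in range(len(M)))


def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def prime_orbits(M, n):
    """Number of periodic orbits of prime period n."""
    total = sum(mobius(n // d) * periodic_points(M, d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n


# Hypothetical example: symbol 1 may not be followed by symbol 1.
M_golden = [[1, 1], [1, 0]]
h_top = math.log((1 + math.sqrt(5)) / 2)

for n in (5, 10, 20):
    # The last column tends to 1 as n grows.
    print(n, prime_orbits(M_golden, n),
          prime_orbits(M_golden, n) * n / math.exp(n * h_top))
```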


In summary, periodic orbits constitute an important indicator of chaotic 
dynamics in hyperbolic systems. Many of the periodic orbits of a system 
can be determined numerically and can be made use of in determining 
transport properties of model systems, the latter being features arising 


from irreversible time evolution in these systems. 


9.5.5.7 Kolmogorov-Sinai entropy 


The Kolmogorov-Sinai (KS) entropy (also referred to as the 'metric entropy';
[52],[148], [29]) gives a measure of the randomness or chaos inherent in
the dynamics of a system, as does the topological entropy. While the two
entropy measures count the rate of growth in the number of orbits that
can be distinguished from one another, the latter is based on a purely
geometrical mode of counting, whereas the former is probabilistic in nature,
with appropriate weights associated with the orbits. Both entropies stand for
the rate at which the logarithm of the number of orbits (a measure of
complexity) grows in time and, in this sense, are analogous to the Boltzmann
entropy per unit volume in statistical mechanics, which measures the rate at
which the logarithm of the total number of distinguishable microscopic states
of a system in the phase space increases with the volume.


The definition of the KS entropy takes into account an invariant measure
$\mu(dX)$ of the dynamical system under consideration (which we assume to
be given), and in this sense, refers to a dynamical feature of a system
while the topological entropy measures a geometrical feature of chaos.
Corresponding to the invariant measure, one can assign a probability
$p = \int_B \mu(dX)$ to any chosen region $B$ in the phase space. We then
consider a finite partition of the phase space into $K$ regions
$B_1, B_2, \ldots, B_K$, where $K$ is a positive integer, the probability
corresponding to $B_i$ being $p_i = \int_{B_i}\mu(dX)$. Knowing that the phase
point lies in any one of the regions $B_i$ ($i = 1, 2, \ldots, K$) corresponds
to an amount of information

$$I(\beta_0) = -\sum_i p_i \ln p_i, \tag{9-47a}$$


where $\beta_0$ denotes the initial partition mentioned above. We now consider
the pre-image $\Phi^{-1}B_j$ of $B_j$ ($j = 1, 2, \ldots, K$) (which is defined
even if $\Phi$ is not invertible), and take the intersections of all the
$B_i$'s with all of the $\Phi^{-1}B_j$'s so generated. We then get a finer
partition ($\beta_1$) of the phase space into regions
$B_{ij} = B_i \cap \Phi^{-1}B_j$ ($i, j = 1, 2, \ldots, K$). Knowing that $X$
lies in box $B_j$ tells us that it was located within one of the boxes
$B_{ij}$ one time step back, for which the information is

$$I(\beta_1) = -\sum_{ij} p_{ij}\ln p_{ij} \qquad \Big(p_{ij} = \int_{B_{ij}}\mu(dX);\ i, j = 1, 2, \ldots, K\Big). \tag{9-47b}$$


This process can be continued. With each backward time step, finer and
finer partitions are generated resulting in progressively larger amounts
of information regarding the location of the phase point. The KS entropy
is then defined as the rate of production of information in the limit of an
infinitely fine partition of the phase space:

$$h_{KS} = \lim_{m\to\infty}\frac{1}{m}\,I(\beta_m) \tag{9-48a}$$

(more precisely, the supremum of this limit over all possible choices of the
initial partition $\beta_0$, which makes the KS entropy independent of the
choice of the partition and a quantity depending only on the intrinsic
dynamics), where $\beta_m$ denotes the partition generated after $m$ time steps:

$$\beta_m = \left\{B_{i_0}\cap\Phi^{-1}B_{i_1}\cap\Phi^{-2}B_{i_2}\cap\cdots\cap\Phi^{-m}B_{i_m}\right\} \qquad (i_0, i_1, \ldots, i_m = 1, 2, \ldots, K). \tag{9-48b}$$


A compact notation for the above collection of sets is

$$\beta_m = \beta_0 \vee \Phi^{-1}\beta_0 \vee \Phi^{-2}\beta_0 \vee \cdots \vee \Phi^{-m}\beta_0. \tag{9-49}$$

In the above expressions, $I(\beta_m)$ stands for

$$I(\beta_m) = -\sum_{i_0 i_1\cdots i_m} p_{i_0 i_1\cdots i_m}\ln p_{i_0 i_1\cdots i_m}, \tag{9-50a}$$

where

$$p_{i_0 i_1\cdots i_m} = \int_{B_{i_0}\cap\Phi^{-1}B_{i_1}\cap\Phi^{-2}B_{i_2}\cap\cdots\cap\Phi^{-m}B_{i_m}}\mu(dX). \tag{9-50b}$$


1. The analogy with the Boltzmann entropy in statistical mechanics is
apparent, where one considers the partition of the phase space into
$W$ microstates, each specified in terms of a small phase
cell. The invariant measure in this case corresponds to the microcanonical
ensemble, i.e., uniform probability for all the phase cells
($p_i = \frac{1}{W}$ ($i = 1, 2, \ldots, W$)). The Boltzmann entropy is then
given, in units of $k_B$, by $-\sum_i p_i \ln p_i$. In other words, both the KS
entropy and the Boltzmann entropy (or, more generally, the Gibbs entropy
corresponding to the canonical ensemble) derive from the expression for Shannon
information. It is to be mentioned, however, that the KS entropy of
a dynamical system is defined in terms of the invariant measure $\mu$.
Given a map $\Phi$ in the phase space, one has to first ensure the existence
of the invariant measure and then determine the latter, which,
in general, is a tall order. In the case of an Axiom A attractor, one
knows that a natural measure exists, namely, the SRB distribution to
be introduced in sec. 9.10. In equilibrium statistical mechanics, on
the other hand, the microcanonical distribution provides the natural
measure, as asserted by Boltzmann who did not have any existence
theorem to bank upon and still went on to propose the ergodic hypothesis.


2. The probabilities $p_{i_0 i_1\cdots i_m}$ can be interpreted as correlation
functions of observables corresponding to the characteristic functions of the
cells constituting the partition under consideration [52]. Since the KS
entropy is based on such $n$-time correlation functions, with $n \to \infty$,
it can be said to describe extremely fine correlations between observables at
successive instants of time.


An alternative interpretation of the KS entropy is to consider all possible
orbits starting at $X$ located at the initial time step ($m = 0$) from within
any specified box, say, $B_i$, in some partition, say, $\beta_0$. An itinerary
in $m$ time steps corresponds to a symbol sequence specifying the successive
boxes visited by the phase point, where the invariant measure $\mu(dX)$ induces
a measure on the space of these symbol sequences. The KS entropy is then
defined as the average information regarding the exact initial location $X$
produced per time step for any specified symbol sequence in the limit
$m \to \infty$, where one has to consider the supremum of this information
measure over all possible partitions $\beta$.


If the partition under consideration is a generating Markov partition (re- 
fer back to sec. 9.5.5.4) then the formula (9-48a) directly gives the KS en- 
tropy, without the necessity of comparing with other possible partitions 


and working out the supremum over all partitions. 
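The growth of the information $I(\beta_m)$ with $m$ can be followed explicitly when the measure induced on symbol sequences is a Markov measure. The sketch below is an illustration under assumptions that are not part of the text: a two-symbol chain with a hypothetical transition probability matrix `P` and its stationary distribution `pi`, with the partition labeled by the current symbol. For such a measure the block probabilities factorize, so each additional time step adds the same amount of information, equal to the standard entropy rate $-\sum_{ij}\pi_i P_{ij}\ln P_{ij}$.

```python
import math
from itertools import product

# Hypothetical two-symbol Markov measure (assumed values, for illustration only).
P = [[0.9, 0.1],
     [0.5, 0.5]]          # P[i][j]: probability of symbol j following symbol i
pi = [5 / 6, 1 / 6]       # stationary distribution: pi = pi P


def block_information(m):
    """I(beta_m): Shannon information of all symbol blocks of length m + 1."""
    total = 0.0
    for word in product(range(2), repeat=m + 1):
        p = pi[word[0]]
        for a, b in zip(word, word[1:]):
            p *= P[a][b]
        if p > 0.0:
            total -= p * math.log(p)
    return total


def entropy_rate():
    """Entropy rate of the chain, the KS entropy of the associated shift."""
    return -sum(pi[i] * P[i][j] * math.log(P[i][j])
                for i in range(2) for j in range(2) if P[i][j] > 0.0)


for m in (1, 4, 8, 12):
    print(m, block_information(m) / m,
          block_information(m) - block_information(m - 1), entropy_rate())
```

The ratio $I(\beta_m)/m$ approaches the entropy rate only as $1/m$, while the increment per time step already equals it at every $m$, which is a convenient numerical check.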


9.5.6 Dynamical measures and topological pressure


The content of this section is based principally on [52], chapter 4; see 


also [25], [29]. 


While a dynamical system is defined with reference to a stationary measure,
many different probability measures on the phase space can be defined that
depend on the dynamics. Given a phase space function $A(X)$ one can define
such a dynamical measure, to be denoted by $\mu_A(dX)$, by first considering an
$(\epsilon,T)$-separated subset and then assigning a probability to each
separated orbit passing through $dX$ proportional to
$\exp\int_{-T}^{+T} A(U_t Y)\,dt$, where $Y$ denotes the point of origin of the
orbit.


The definition of an $(\epsilon,T)$-separated subset $S$ in a flow is analogous
to that of an $(\epsilon,n)$-separated set in a mapping (refer back to sec.
9.5.5.5); one requires that the orbits originating from any pair of points in
the set be resolved, with reference to a given separation $\epsilon$, during a
given time interval, which we take to be the interval from $-T$ to $T$. In the
case of a compact phase space or a compact invariant set in it, one can find
an $S$ made up of a finite number of points. It is such a subset that we refer
to in the following.


The normalized probability assigned to the orbit is then given by

$$p_A(\epsilon,T,Y) = \frac{\exp\int_{-T}^{+T} A(U_t Y)\,dt}{Z(\epsilon,T,A)}, \tag{9-51a}$$

where

$$Z(\epsilon,T,A) = \sup_{S}\sum_{Y\in S}\exp\int_{-T}^{+T} A(U_t Y)\,dt, \tag{9-51b}$$

where the supremum is taken to eliminate the dependence on the
$(\epsilon,T)$-separated subset we started with.

The topological pressure is then defined as

$$P(A) = \lim_{\epsilon\to 0}\lim_{T\to\infty}\frac{1}{2T}\ln Z(\epsilon,T,A). \tag{9-52}$$

It may be noted that the definition of $Z$ is analogous to the partition
function in statistical mechanics, while that of the topological pressure
is analogous to the free energy. The probability $p_A$ is analogous to the
Gibbs probability $\frac{e^{-\beta E}}{Z}$, where the dynamical quantity
$\int_{-T}^{+T} A(U_t X)\,dt$, evaluated over an orbit through $X$, plays the
role of ($-\beta$ times) energy, a static variable in the phase space.
Accordingly, $Z(\epsilon,T,A)$ is referred to as a dynamical partition function
(or, more precisely, a functional depending on the observable $A$).


Referring to the dynamical probability functional $p_A(\epsilon,T,Y)$ defined
in (9-51a), one can define a dynamical measure $\mu_A(dX)$ as

$$\mu_A(dX) = \lim_{\epsilon\to 0}\lim_{T\to\infty}\sup_{S}\sum_{Y\in S} p_A(\epsilon,T,Y)\,\frac{1}{2T}\int_{-T}^{+T}\delta(X - U_t Y)\,dt\,dX. \tag{9-53}$$

Such dynamical measures are referred to as Gibbs measures. The limit
$T \to \infty$ in the above expression is analogous to the thermodynamic limit
in equilibrium statistical mechanics since, in this limit, $\mu_A(dX)$ gives an
invariant measure in the phase space. Depending on the type of the system
under consideration, its natural measure is obtained as a particular
instance of these generalized Gibbs measures. Thus, one is led to the SRB
measure for a closed hyperbolic system while, in the case of a repeller of
an open system, a Gibbs measure is obtained which can be looked upon
as a generalization of the SRB measure.

The topological pressure has a number of desirable properties. Thus, for


any two observables $A$, $B$, one has

$$\text{(convexity:)}\quad P(\nu A + (1-\nu)B) \le \nu P(A) + (1-\nu)P(B) \qquad (0 \le \nu \le 1), \tag{9-54a}$$

and

$$P(A+B) \le P(A) + P(B). \tag{9-54b}$$

Further, in the special case $A = 0$, the topological pressure reduces to the
topological entropy of the dynamical system under consideration,

$$P(0) = h_{top}, \tag{9-54c}$$

where the relation may apply either to the entire phase space or to an
invariant set in it, depending on the context.
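A discrete-time analog of the dynamical partition function can be evaluated exactly for a shift on symbol sequences, which makes the properties listed above easy to check. The sketch below is an illustration and not the construction used in the text: it assumes a 0/1 transition matrix `M` and an observable that depends only on the current symbol, with values `a[j]`. In that case the sum over allowed words of length $n$ of $\exp\sum_k a_{q_k}$ grows as the $n$-th power of the largest eigenvalue of the weighted matrix $M_{ij}e^{a_j}$, and the pressure (per time step) is its logarithm. All names are hypothetical.

```python
import math


def spectral_radius(T, iters=500):
    """Largest eigenvalue of a primitive non-negative matrix by power iteration."""
    n = len(T)
    v = [1.0 / n] * n                 # kept normalized so that sum(v) == 1
    lam = 0.0
    for _ in range(iters):
        w = [sum(T[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                  # tends to the eigenvalue since sum(v) == 1
        v = [x / lam for x in w]
    return lam


def pressure(M, a):
    """Pressure of the observable with values a[j] on the shift defined by M."""
    n = len(M)
    T = [[M[i][j] * math.exp(a[j]) for j in range(n)] for i in range(n)]
    return math.log(spectral_radius(T))


M_golden = [[1, 1], [1, 0]]           # hypothetical example matrix
A = [0.3, -1.2]                       # assumed observable values
B = [-0.7, 0.5]
nu = 0.4
mix = [nu * x + (1 - nu) * y for x, y in zip(A, B)]

print(pressure(M_golden, [0.0, 0.0]), math.log((1 + math.sqrt(5)) / 2))
print(pressure(M_golden, mix),
      nu * pressure(M_golden, A) + (1 - nu) * pressure(M_golden, B))
```

The first line printed compares the pressure of the null observable with the topological entropy of the same shift; the second compares the two sides of the convexity inequality.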


The topological entropy for a flow can be defined in a manner analogous to
that for a mapping (refer to sec. 9.5.5.5), by making use of
$(\epsilon,T)$-separated subsets.


Referring to the dynamical measure $\mu_A$ defined for the observable $A$ (as
in (9-53)), the average of an observable $B$ with respect to this measure is
given in terms of the topological pressure as

$$\langle B\rangle_{\mu_A} = \int B(X)\,\mu_A(dX) = \left.\frac{\partial}{\partial\lambda}P(A + \lambda B)\right|_{\lambda=0}. \tag{9-55}$$

Taking $B = A$, this formula can be used to define $\langle A\rangle_{\mu_A}$.
Further, the measure $\mu_A(dX)$ can be made use of in defining a KS entropy
$h_{KS}(\mu_A)$ (recall that the KS entropy is defined with reference to an
invariant measure, which is commonly taken to be the natural measure under the
dynamics). The following fundamental identity then holds,

$$h_{KS}(\mu_A) = -\langle A\rangle_{\mu_A} + P(A), \tag{9-56}$$


analogous to the relation between the entropy, the internal energy, and
the free energy of a thermodynamic system in equilibrium. Further,
analogous to the variational principle of equilibrium statistical mechanics
where the equilibrium state corresponds to the minimum value of the
free energy functional against variations of the probability distribution,
one obtains the variational principle

$$P(A) = \sup_{\mu}\left[h_{KS}(\mu) + \langle A\rangle_{\mu}\right]. \tag{9-57}$$

In this case, the supremum is realized for $\mu = \mu_A$, for which the
identity (9-56) holds.
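The variational principle can be checked by hand in the simplest setting. The sketch below is an illustration under assumptions that are not part of the text: a shift on two symbols with no forbidden transitions, an observable taking the assumed values `a[0]`, `a[1]` on the current symbol, and a search restricted to Bernoulli measures with weights $(q, 1-q)$, for which the KS entropy is $-q\ln q - (1-q)\ln(1-q)$. For this observable the pressure is $\ln(e^{a_0} + e^{a_1})$ and the maximizing measure has $q = e^{a_0}/(e^{a_0} + e^{a_1})$, in the manner of a Gibbs distribution.

```python
import math


def bernoulli_functional(q, a):
    """h_KS + <A> for the Bernoulli measure with weights (q, 1 - q)."""
    entropy = -sum(p * math.log(p) for p in (q, 1.0 - q) if p > 0.0)
    mean_A = q * a[0] + (1.0 - q) * a[1]
    return entropy + mean_A


a = [0.8, -0.3]                                   # assumed observable values
Z = math.exp(a[0]) + math.exp(a[1])
P_exact = math.log(Z)                             # pressure of this observable
q_star = math.exp(a[0]) / Z                       # Gibbs-like weight

# Scan a grid of Bernoulli measures and locate the maximum of the functional.
grid = [k / 1000 for k in range(1001)]
best_q = max(grid, key=lambda q: bernoulli_functional(q, a))

print(best_q, q_star)
print(bernoulli_functional(q_star, a), P_exact)
```

Restricting the search to Bernoulli measures is a simplification; it suffices here only because the observable depends on a single symbol.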


An important special case of the dynamical measure and topological pressure
defined for the arbitrarily chosen observable $A$ is obtained by taking

$$A(X) = -\beta u(X) = -\beta\sum_{\lambda_i>0}\chi_i(X), \tag{9-58}$$

where the local stretching rates $\chi_i(X)$ ($i = 1, 2, \ldots, N$) and the
local dispersion rate $u(X)$ have been introduced in sec. 9.5.2.2 (see formulae
(9-31), (9-33), and (9-35)). In the above formula $\beta$ is a free parameter
introduced to define a family of pressure functions and is to be distinguished
from the inverse temperature.


The topological pressure functional now becomes a function of the single
parameter $\beta$,

$$P_\beta = P\Big(-\beta\sum_{\lambda_i>0}\chi_i\Big), \tag{9-59}$$

(recall that for a hyperbolic system the Lyapunov exponents are strictly
bounded away from zero). We denote the dynamical measure corresponding
to (9-58) by $\mu_\beta$. The fundamental identity (9-56) appears as

$$h_{KS}(\mu_\beta) = \beta\langle u\rangle_{\mu_\beta} + P_\beta. \tag{9-60}$$


The deep analogy between the dynamically defined quantities such as
$\mu_A$, $P(A)$, $h_{KS}(\mu_A)$ (or, $\mu_\beta$, $P_\beta$,
$h_{KS}(\mu_\beta)$) and quantities made use of in equilibrium statistical
mechanics provides a powerful tool for the analysis of chaotic dynamical
systems, referred to as the thermodynamic formalism. As in the case of
equilibrium statistical mechanics, this approach can be looked upon as an
application of the large deviation formalism to dynamical systems, where the
role of the thermodynamic limit is played by the limit of infinitely large
time intervals.


As in the large deviation principle, the thermodynamic formalism admits of a
Legendre transformation: a chaotic dynamical system is characterized by an
entropy function $S(\phi)$ (see [52], chapter 4) in addition to the
topological pressure $P_\beta$, the latter being the analog of the free energy
of a thermodynamic system. The entropy is defined by means of the following
relation

$$\delta W(\phi) \sim \exp\big(2T S(\phi)\big)\prod_{i=1}^{L}\delta\phi_i \qquad (T \to \infty), \tag{9-61}$$

where $\delta W(\phi)$ stands for the number of points in an
$(\epsilon,T)$-separated subset such that the averages of the local stretching
rates associated with these points, evaluated over the time interval from $-T$
to $T$, lie between $\phi_i$ and $\phi_i + \delta\phi_i$
($i = 1, 2, \ldots, L$). In this expression, $\phi$ is a vector parameter
having $L$ components $\phi_1, \phi_2, \ldots, \phi_L$, where $L$ stands for
the number of positive Lyapunov exponents. The entropy function is related to
the pressure function by a Legendre transformation, and the topological entropy
and the KS entropy can both be derived from it [52].


Referring to the family of pressure functions $P_\beta$, it remains to identify
the value of $\beta$ in any given context so that $\mu_\beta$ stands for the
natural measure for the dynamical system under consideration.


9.5.6.1 Dynamical measure in closed hyperbolic systems 


For instance, let us consider a closed system, i.e., one from which no
trajectory can escape through a boundary. In the case of a time-independent
Hamiltonian system, which has been our focus of concern throughout
this book up to and including chapter 7 (recall that chapter 8 is devoted
mainly to non-equilibrium problems involving time-dependent perturbations),
the natural measure is known to be represented by the microcanonical ensemble
(which we denote by $\mu_{eq}$) which, in the present context, corresponds to
the value $\beta = 1$,

$$\mu_{eq} = \mu_{(\beta=1)}. \tag{9-62}$$

In this case it turns out that the topological pressure is zero,

$$P_{(\beta=1)} = 0, \tag{9-63a}$$

and the fundamental identity (9-60) appears as

$$h_{KS}(\mu_{eq}) = \sum_{\lambda_i>0}\langle\lambda_i\rangle_{\mu_{eq}} = \sum_{\lambda_i>0}\lambda_i. \tag{9-63b}$$

Here the last equality is written by recalling that the local Lyapunov
exponents $\lambda_i(X)$ are independent of the location of the phase space
point $X$ (except for a set of measure zero) in the case of an ergodic measure.


The formula (9-63b) is referred to as Pesin's identity. It holds for invariant
measures more general than the microcanonical measure, the latter being
smooth along both the stable and unstable manifolds of the system under
consideration. More generally, the SRB measures that characterize
Axiom-A attractors in dissipative systems (sec. 9.10) are smooth along
the unstable manifolds but not necessarily so along the stable manifolds,
and the Pesin identity continues to hold for these. For a system for which
the natural measure is not smooth along the unstable manifold, as in the
case of an open system with escape, the Pesin identity is replaced with
Ruelle's inequality,

$$h_{KS} \le \sum_{\lambda_i>0}\lambda_i. \tag{9-64}$$


9.5.6.2 Dynamical measure in open hyperbolic systems 


In statistical mechanics, one often encounters open systems where par- 
ticles from within a finite volume (V) are absorbed or removed at the 
boundary, and only a few of those remain trapped in a steady state. The 
latter is thus described in terms of the natural measure on the repeller 
made up of the trapped trajectories in a bounded region of the phase 
space. This measure is obtained by referring to the escape rate defined 
as follows [52]. The system under consideration is assumed to be a re- 


versible Hamiltonian one. 


We start from a set $\mathcal{U}$ sufficiently large to contain the repeller
and introduce a uniform measure on it that can be constructed in a computer
experiment by assuming that a large number of particles ($N_0 \to \infty$) are
placed randomly in the set. We denote this measure (uniform in $\mathcal{U}$
and zero outside) by $\mu_0(dX)$. As particles escape (from within $V$ and
hence from the region $\mathcal{U}$ in the phase space), the fraction of
initial points that remain in $\mathcal{U}$ throughout the time interval from
0 to $T$ ($T > 0$) is given by (refer to sec. 9.5.5.2)

$$f_{\mathcal{U}}^{(+)}(T) = \mu_0\big(\Upsilon_{\mathcal{U}}^{(+)}(T)\big). \tag{9-65a}$$

Similarly, the fraction of particles that remain within $\mathcal{U}$ for all
times from $-T$ to 0 can be expressed as

$$f_{\mathcal{U}}^{(-)}(T) = \mu_0\big(\Upsilon_{\mathcal{U}}^{(-)}(T)\big), \tag{9-65b}$$

while the fraction that remains from time $-T$ to $+T$ is

$$f_{\mathcal{U}}(T) = \mu_0\big(\Upsilon_{\mathcal{U}}(T)\big), \tag{9-65c}$$

where

$$\Upsilon_{\mathcal{U}}(T) = \Upsilon_{\mathcal{U}}^{(+)}(T)\cap\Upsilon_{\mathcal{U}}^{(-)}(T). \tag{9-65d}$$

Defining the future and past escape rates as

$$\gamma^{(+)} = -\lim_{T\to\infty}\frac{1}{T}\ln f_{\mathcal{U}}^{(+)}(T), \qquad \gamma^{(-)} = -\lim_{T\to\infty}\frac{1}{T}\ln f_{\mathcal{U}}^{(-)}(T), \tag{9-66a}$$

the reversibility of the system implies $\gamma^{(+)} = \gamma^{(-)}$, both
being equal to

$$\gamma = -\lim_{T\to\infty}\frac{1}{2T}\ln f_{\mathcal{U}}(T), \tag{9-66b}$$

the escape rate for the repeller.


In the long run ($t \to +\infty$), the initial set of points that remain
trapped are distributed in accordance with the natural measure on the repeller
that can be expressed in the form

[equation (9-67) is illegible in the source]

(reason this out), where $I_{\Upsilon}(X)$ stands for the characteristic
function for the set $\Upsilon$ (with value 1 if $X$ belongs to $\Upsilon$,
and 0 otherwise). This expression can be used for an approximate construction
of the natural measure by the simulation of the dynamics in computer
experiments.


On the other hand, one can define a dynamical measure on the repeller
as in (9-53), with $A(X) = -\beta u(X) = -\beta\sum_{\lambda_i>0}\chi_i(X)$ as
in (9-58). It then turns out ([52], chapter 4) that this dynamical measure is
the same as the natural measure of (9-67) for the value $\beta = 1$, as in the
case of a closed system. One also obtains, as a corollary, the result that the
pressure $P_{(\beta=1)}$ is related to the escape rate as

$$P_{(\beta=1)} = -\gamma. \tag{9-68}$$

The fundamental identity (9-60) now appears in the form

$$\gamma = \sum_{\lambda_i>0}\langle\lambda_i\rangle_{\mu_{(\beta=1)}} - h_{KS}(\mu_{(\beta=1)}). \tag{9-69}$$
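The relation between the escape rate, the Lyapunov exponent, and the KS entropy can be checked in a computer experiment of the kind mentioned above. The sketch below is a toy illustration and not a system considered in the text: it assumes the one-dimensional map $x \to 3x \bmod 1$ on the unit interval, with a point counted as escaped once it falls in the open middle third. For this map the stretching rate is $\ln 3$ everywhere, the trapped set is the middle-third Cantor set with KS entropy $\ln 2$ for its natural measure, and the escape rate per time step is therefore expected to be $\ln 3 - \ln 2 = \ln(3/2)$. The function name, the number of points, and the seed are arbitrary choices.

```python
import math
import random


def survival_fraction(n_steps, n_points=100_000, seed=0):
    """Fraction of uniformly placed points that stay outside the gap
    (1/3, 2/3) during the first n_steps iterations of x -> 3x mod 1."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(n_points):
        x = rng.random()
        alive = True
        for _ in range(n_steps):
            if 1.0 / 3.0 < x < 2.0 / 3.0:
                alive = False         # the point escapes through the gap
                break
            x = (3.0 * x) % 1.0
        if alive:
            survivors += 1
    return survivors / n_points


f5 = survival_fraction(5)
f10 = survival_fraction(10)

# Escape rate per time step from the decay of the surviving fraction.
gamma_est = -math.log(f10 / f5) / 5
gamma_expected = math.log(3.0) - math.log(2.0)    # stretching rate minus KS entropy

print(f5, f10)
print(gamma_est, gamma_expected)
```

Taking the ratio of two survival fractions removes the constant prefactor in the decay law; the estimate carries a statistical error that shrinks as the number of initial points is increased.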


The relation (9-69) is the generalization of the Pesin identity (9-63b) to open
systems. While it relates characteristic features of the dynamics of a system,
it can be made use of in working out transport properties such as the diffusion
coefficient for non-interacting particles scattered by fixed discs within
absorbing boundaries, where the escape rate is seen to be related to the
diffusion coefficient. The escape rate theory constitutes an important field
of application of dynamical chaos to problems in statistical mechanics.


9.5.7 Ruelle-Pollicott resonances and the decay of correlations


In this section, we return to the consideration of the problem of approach 
to equilibrium in the Liouvillean dynamics of a Hamiltonian system, ini- 
tiated in sections 9.3 and 9.4. Here we state a few relevant results, on 


which further details are to be found in [52]. 


The equilibrium state corresponds to an invariant measure and the ap- 
proach to the invariant measure depends on the boundary conditions in 
addition to the Hamiltonian describing the dynamics. In other words, two 
systems described by the same Hamiltonian behave in the same manner 
locally in the phase space while the large scale structure of the invariant 
measure and the mode of approach to it may be quite different for the 


two. 


The invariant measure $\mu_{eq}$ may differ in the two limits
$t \to \pm\infty$ even for a system with a time-reversible Hamiltonian if the
boundary conditions do not respect the time reversal symmetry.


As we saw in sec. 9.4, the expectation value of an observable with reference
to the evolving distribution function asymptotically approaches
its equilibrium value for a system with mixing dynamics. More generally,
mixing ensures that the time correlation function for any two observables
decays asymptotically as stated in (9-21b), which we rewrite as

$$\lim_{t\to\infty} C_{AB}(t) = 0, \tag{9-70a}$$

where

$$C_{AB}(t) = \langle A(U_t X)B(X)\rangle - \langle A(X)\rangle\langle B(X)\rangle, \tag{9-70b}$$

the averages being taken with reference to the equilibrium distribution.
In order to explore the effect of the dynamics on the decay of correlations
such as $C_{AB}(t)$, it is rewarding to look in the frequency domain, i.e., to
consider the spectral functions such as

$$S_{AB}(\omega) = \int_{-\infty}^{+\infty} e^{i\omega t}\,C_{AB}(t)\,dt, \tag{9-71}$$

where $S_{AA}$, corresponding to $B = A$, is referred to as the power spectrum
of $A$.


Spectral theories were developed (independently and almost simultane- 
ously) by Koopman and von Neumann who considered the spectral func- 
tions for real values of w. Subsequently, Pollicott and Ruelle (again, inde- 
pendently) extended the theory to the complex frequency domain, thereby 


widening its scope. 
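A short worked example may help to see what the extension to complex frequencies buys. It is a standard illustration and is not taken from the text: assume an autocorrelation function that decays exponentially, with an amplitude $C_0$ and a decay rate $\Gamma$ (both hypothetical symbols introduced here). The power spectrum defined in (9-71) is then a Lorentzian,

```latex
C_{AA}(t) = C_0\, e^{-\Gamma |t|} \quad (C_0 > 0,\ \Gamma > 0)
\quad\Longrightarrow\quad
S_{AA}(\omega)
  = \int_{-\infty}^{+\infty} e^{i\omega t}\, C_0\, e^{-\Gamma |t|}\, dt
  = C_0 \left[ \frac{1}{\Gamma - i\omega} + \frac{1}{\Gamma + i\omega} \right]
  = \frac{2 C_0 \Gamma}{\omega^2 + \Gamma^2}.
```

For real $\omega$ this is a smooth function, as expected for a continuous spectrum, and the decay rate shows up only as the width of the peak. Continued to complex $\omega$, the same function has simple poles at $\omega = \pm i\Gamma$, so that the decay rate is read off directly from the position of a pole. An algebraically decaying correlation function does not produce isolated poles of this kind.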


In the spectral theory based on real frequencies, one distinguishes between
systems with a discrete spectrum and those for which the spectrum includes a
continuous part. In the former case, the dynamics is quasi-periodic, which
does not involve decaying correlations, while decaying correlations arise for
systems with a continuous spectrum. In the latter case, extending the
frequency domain to the complex plane (the Pollicott-Ruelle theory) allows one
to distinguish between different types of decay such as exponential and
algebraic ones.


Looking at the correlation function (9-70b), one is led to the spectral
decomposition of the Koopman operator $K_t$ ($= e^{i\mathcal{L}t}$) in the
Hilbert space of observables since $A(U_t X) = K_t A(X)$. The Koopman operator
is unitary in the case of a time-reversal invariant Hamiltonian (and
$\mathcal{L} = iL$ is self-adjoint), and its spectrum is confined to the unit
circle, with the spectral decomposition of the form

$$K_t = e^{i\mathcal{L}t} = \sum_n e^{i\omega_n t}\,\mathcal{E}_n + \int e^{i\omega t}\,d\mathcal{E}(\omega), \tag{9-72}$$

where the second term on the right hand side corresponds to the continuous
spectrum, with $-\pi < \omega \le \pi$ in the case of a mapping and
$-\infty < \omega < \infty$ in the case of a flow. In the above expression,
$\mathcal{E}_n$ and $\mathcal{E}(\omega)$ are self-adjoint projection
operators, projecting on to the subspaces spanned by the eigenfunctions (or
generalized eigenfunctions) of $K_t$, and constitute the resolution of the
identity operator ($I$),

$$I = \sum_n \mathcal{E}_n + \int d\mathcal{E}(\omega). \tag{9-73}$$


In the case of a dynamical system defined in terms of a mapping $\Phi$ (recall
that a mapping arises naturally for a Hamiltonian system in the form of
a fixed-time map or of a Poincaré map), the Koopman operator and the
Frobenius-Perron operator are defined only for discrete ‘times’
($n = 0, \pm 1, \pm 2, \cdots$). For a measure-preserving mapping, the
single-step Koopman operator can be expressed in the form $e^{i\mathcal{L}}$,
where $\mathcal{L}$ is self-adjoint when considered in the space of
square-integrable functions in the phase space. However, the Pollicott-Ruelle
theory deals with a larger space of singular distributions (see below), in
which $\mathcal{L}$ is no longer self-adjoint (sec. 9.5.8.1 considers the case
of the baker’s transformation). In the present section, we confine our
attention to Hamiltonian flows, while results for mappings can be formulated
in analogous terms.


The discrete spectrum is made up of the eigenvalues $e^{i\omega_n t}$ of
$K_t$, where $\omega_n$ are the real eigenvalues of the generator
$\mathcal{L}$. As mentioned above, a purely discrete spectrum implies a
quasi-periodic behavior, corresponding to an integrable Hamiltonian. A
quasi-periodic motion is confined to a torus corresponding to specified values
of the relevant action variables and to a given value of the energy, and is
described in terms of a set of frequencies
$\Omega_1, \Omega_2, \cdots, \Omega_s$ (where $s$ stands for the number of
degrees of freedom), in which case the eigenvalues $\omega_n$ are linear
combinations of the $\Omega$’s. The condition for the motion to be ergodic is
that the eigenvalue $\omega_n = 0$ is to be non-degenerate, which requires
that the frequencies $\Omega_1, \Omega_2, \cdots, \Omega_s$ are to be
incommensurate.


The above condition for ergodicity continues to hold even when the motion is
not quasi-periodic: the motion of the representative point in the phase space
is ergodic if and only if the eigenvalue $\omega_n = 0$ of the Liouville
generator $\mathcal{L}$ (corresponding to the eigenvalue 1 of the Koopman
operator) is non-degenerate.


On the other hand, the system is characterized by mixing dynamics if and only
if the eigenvalue 1 of the Koopman operator (corresponding to the eigenvalue
$\omega = 0$ of the Liouville operator) is non-degenerate and is, in addition,
the only eigenvalue in the discrete spectrum, which implies that the remainder
of the spectrum is necessarily continuous.


Generally speaking, a mixing system is defined by the property of decaying 
correlations at large times for all pairs of observables A, B, as indicated in 


sec. 9.4. 


While considerations based on the real continuous spectrum lead to the
conclusion that the correlation functions decay with time, the detailed
nature of the decay is obtained from the extension to complex frequencies.
Here the focus is on the existence of complex poles of the spectral function
$S(\omega)$ (corresponding to a correlation function $C(t)$ in the time
domain) referred to as resonances. These poles, termed the Ruelle-Pollicott
resonances (at times referred to as the Ruelle resonances for the sake of
brevity), are analogous to the ones that are found in the study of spectral
functions in the theory of stochastic processes.


The resonances arise as singularities of the function $S(z)$ obtained from
$S(\omega)$ by continuation into the complex domain, from which the
correlation function $C(t)$ is recovered as

$$C(t) = \frac{1}{2\pi} \int e^{-izt}\, S(z)\, dz, \tag{9-74}$$

where the integration is to be performed along the real axis, but can be
appropriately deformed to a contour depending on the singularities of
$S(z)$. For $t > 0$, $S(\omega)$ can be continued below the real axis while,
for $t < 0$, the continuation is possible to values of $z$ above the real
axis. If $S(z)$ has a simple pole at $z = p \mp iq$ ($q > 0$), then $C(t)$
decays as $e^{\mp qt} e^{-ipt}$ where the upper (resp. lower) sign
corresponds to $t > 0$ (resp. $t < 0$).
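
The way a pair of simple poles produces an exponentially decaying correlation
can be seen in a minimal worked example. The correlation function below is
chosen purely for illustration (it is not taken from any specific system), and
the Fourier conventions $S(\omega) = \int e^{i\omega t} C(t)\, dt$ and
$C(t) = \frac{1}{2\pi} \int e^{-i\omega t} S(\omega)\, d\omega$ are assumed:

```latex
\begin{align*}
C(t) &= e^{-q|t|}\, e^{-ipt} \qquad (q > 0), \\
S(\omega) &= \int_{0}^{\infty} e^{[i(\omega - p) - q]t}\, dt
           + \int_{-\infty}^{0} e^{[i(\omega - p) + q]t}\, dt \\
          &= \frac{1}{q - i(\omega - p)} + \frac{1}{q + i(\omega - p)}
           = \frac{2q}{(\omega - p)^2 + q^2}, \\
S(z) &= \frac{2q}{(z - p - iq)(z - p + iq)} .
\end{align*}
```

The continued function has simple poles at $z = p \pm iq$. For $t > 0$ the
contour can be closed in the lower half plane, where only the pole at
$z = p - iq$ contributes, and the residue theorem returns
$C(t) = e^{-qt} e^{-ipt}$; for $t < 0$ the pole at $z = p + iq$ gives
$C(t) = e^{qt} e^{-ipt}$.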


In the case of an Axiom-A system, for which a natural measure in the form
of a Gibbs state exists (e.g., the SRB measure), Pollicott proved that the
spectral function $S_{AB}(\omega)$ is meromorphic (i.e., analytic except for
poles) in the strip $|\mathrm{Im}\,\omega| < \delta$ for some $\delta > 0$,
the position of the poles being independent of $A$, $B$. The residue at a pole
can be expressed as a product of two factors pertaining respectively to $A$,
$B$. The proof of meromorphicity makes use of the thermodynamic formalism,
drawing upon the analogy between time correlations in an Axiom-A system and
spatial correlations in a one-dimensional spin system with exponentially
decaying interactions. The mixing condition excludes the existence of poles on
the real axis of the complex frequency plane.


In the case of systems of more general types for which the spectral func- 
tion possibly possesses branch cuts along with poles, the time-dependence 
of correlation functions is more complex though, once again, the decay at 


large times persists, as it should for a mixing system. 


Exact results on the existence of Ruelle-Pollicott resonances are rare, though 
numerically they have been shown to exist in model systems. A few concrete 
results exist, for instance, for the baker’s map (see sec. 9.5.8.1), and hyper- 


bolic maps on the 2-torus. 


The locations of the resonances of the spectral functions, which are an
intrinsic property of the dynamics of a system (since these are independent
of the observables $A$, $B$), can be related to the generalized eigenvalues of
the Frobenius-Perron operator [52] where the eigenfunctions are distributions
rather than square integrable functions in the phase space (see
sec. 9.5.8.1). In this larger space of singular distributions, the Koopman
operator no longer appears as a unitary one.


The spectral decomposition for the expectation value of an observable
appears in the form (refer to (9-17a))

$$\langle A \rangle_t \underset{t \to \infty}{\sim} \langle A | \rho_{\rm eq} \rangle \langle \tilde{\rho}_{\rm eq} | \rho_0 \rangle + \sum_i e^{-\gamma_i t} \langle A | \rho_i \rangle \langle \tilde{\rho}_i | \rho_0 \rangle + \cdots, \tag{9-75}$$

where the Dirac notation is used for phase space functions (which, in
general, may be singular distributions) and their scalar products. In this
expression, the first term on the right hand side depicts the contribution
of the equilibrium state $\rho_{\rm eq}$ (the eigenfunction of $F_t$
corresponding to eigenvalue 1) to the expansion, while the second term
represents the sum of the contributions of the simple poles (of the spectral
function) with eigenvalues $e^{-\gamma_i t}$ ($\mathrm{Re}(\gamma_i) > 0$,
$i = 1, 2, \cdots$) of the Frobenius-Perron operator. The dots on the right
hand side stand for possible contributions of multiple poles and branch cuts
of the spectral function. The vectors $|\rho_i\rangle$ are the right
eigenvectors of the Frobenius-Perron operator, and $\langle\tilde{\rho}_i|$
the corresponding left eigenvectors,

$$F_t |\rho_i\rangle = e^{-\gamma_i t} |\rho_i\rangle, \qquad \langle \tilde{\rho}_i | F_t = e^{-\gamma_i t} \langle \tilde{\rho}_i | \qquad (i = 1, 2, \cdots), \tag{9-76}$$


where we recall that $F_t$ is no longer unitary in the enlarged space of
singular distributions in which the generalized eigenfunctions appear [52]. A
simple instance of such an expansion appears in the Hamiltonian motion
of a particle moving over a potential barrier with a single maximum [53].
One finds in this case a single decaying eigenfunction for $t \to \infty$,
which is a singular distribution concentrated on the unstable manifold
emanating from the single unstable point in the phase space.


9.5.8 Baker’s map, Arnold map, and Smale horseshoe 
9.5.8.1 The baker’s transformation 


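The baker’s transformation is a map of the unit square in the 2D Euclidean
plane (to be referred to as the x-y plane), and constitutes a simple
dynamical system (lacking in continuity) of great heuristic value in
statistical mechanics ([29], chapters 7,8). It is described by the mapping

$$(x, y) \to \Phi(x, y) = (x', y') = \begin{cases} \left(2x, \frac{y}{2}\right) & (\text{for } x < \frac{1}{2}), \\ \left(2x - 1, \frac{y + 1}{2}\right) & (\text{for } x > \frac{1}{2}), \end{cases} \tag{9-77a}$$

with the inverse

$$(x, y) \to \Phi^{-1}(x, y) = (x', y') = \begin{cases} \left(\frac{x}{2}, 2y\right) & (\text{for } y < \frac{1}{2}), \\ \left(\frac{x + 1}{2}, 2y - 1\right) & (\text{for } y > \frac{1}{2}). \end{cases} \tag{9-77b}$$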


Fig. 9-4 depicts the way the baker’s map transforms the two halves,
$x < \frac{1}{2}$ and $x > \frac{1}{2}$, of the unit square where the
transformed halves now make up the regions $y < \frac{1}{2}$ and
$y > \frac{1}{2}$ of the same unit square. The transformation is shown in two
steps where, in the first step, the square is compressed vertically (and
stretched horizontally) into a rectangle of height $\frac{1}{2}$ and width 2,
and then, in the second, the rectangle is cut into two, of which the left half
remains in its place while the right half is placed on top of it whereby the
unit square is reassembled. These imagined operations resemble the way a baker
kneads his dough.




Figure 9-4: Depicting schematically the baker’s map transforming the unit 
square onto itself; the transformation is imagined to be made up of two steps 
where, in the first step, the square is compressed vertically into a rectangle of 
height 1/2, and then, in the second, the rectangle is cut into two, of which the
left half remains in its place while the right half is placed on top of it whereby 
the unit square is reassembled; these imagined operations resemble the way a 
baker kneads his dough. 


The x and y co-ordinates of a point in the unit square can be represented
as infinite sequences of 0’s and 1’s by means of their dyadic expansion,
of the form

$$\begin{aligned}
x &= 0.a_1 a_2 a_3 \cdots = a_1 \times \frac{1}{2} + a_2 \times \left(\frac{1}{2}\right)^2 + a_3 \times \left(\frac{1}{2}\right)^3 + \cdots, \\
y &= 0.b_1 b_2 b_3 \cdots = b_1 \times \frac{1}{2} + b_2 \times \left(\frac{1}{2}\right)^2 + b_3 \times \left(\frac{1}{2}\right)^3 + \cdots, \\
&\quad (a_i, b_i \in \{0, 1\},\ i = 1, 2, 3, \cdots),
\end{aligned} \tag{9-78a}$$

and the two can be combined into a single bi-infinite sequence with a dot
separating the two halves of the latter so as to represent the point $(x, y)$
as

$$(x, y) \leftrightarrow \cdots b_3 b_2 b_1 . a_1 a_2 a_3 \cdots. \tag{9-78b}$$

The baker’s transformation then corresponds to a left shift in this
bi-infinite sequence whereby the dot is shifted one place to the right:

$$(x, y) \to (x', y') \leftrightarrow \cdots b_3 b_2 b_1 a_1 . a_2 a_3 \cdots. \tag{9-78c}$$


This describes a Bernoulli shift mentioned in sec. 9.5.5.4, and is equiv- 
alent to a stochastic process generated by a random coin toss. In other 
words, though the baker’s transformation corresponds to a deterministic 
dynamical system (analogous to one described in terms of a Hamiltonian 
in a phase space) it is actually equivalent to a completely random coin 


toss; this comes about due to its discontinuous nature. 
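
The correspondence between the baker's transformation and the shift of binary
digits can be checked directly on points with finite dyadic expansions. The
following is a minimal sketch, not a definitive implementation: the function
names (`baker`, `baker_inverse`, `from_bits`) and the sample digit strings are
illustrative choices, and the boundary point $x = \frac{1}{2}$ (resp.
$y = \frac{1}{2}$) is assigned to the second branch by assumption. Exact
rational arithmetic is used so that the comparison is free of rounding errors.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """Forward baker's map; x = 1/2 is assigned to the second branch (assumption)."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def baker_inverse(x, y):
    """Inverse baker's map; y = 1/2 is assigned to the second branch (assumption)."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

def from_bits(bits):
    """Value of the binary fraction 0.b1 b2 b3 ... for a finite list of digits."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

# Illustrative digit strings a1 a2 ... and b1 b2 ... (arbitrary choices).
a_bits = [1, 0, 1, 1, 0, 0, 1]
b_bits = [0, 1, 1, 0, 1]

x, y = from_bits(a_bits), from_bits(b_bits)
x_new, y_new = baker(x, y)

# The shift moves a1 from the head of the x-digits to the head of the y-digits.
x_shifted = from_bits(a_bits[1:])
y_shifted = from_bits([a_bits[0]] + b_bits)

print(x_new == x_shifted, y_new == y_shifted)
print(baker_inverse(x_new, y_new) == (x, y))
```

Repeating the map feeds the digits $a_1, a_2, \cdots$ one by one into the
y co-ordinate, which is why the sequence of left/right halves visited by an
orbit is as unpredictable as the digit string of the initial x co-ordinate.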


The baker’s map, being a discontinuous one, is not a diffeomorphism, 
but possesses a number of properties of Anosov diffeomorphisms, and 
is referred to as an Anosov map. It resembles a Hamiltonian system in 
being area-preserving and invertible, and is characterized by a natural 
measure that coincides with the Lebesgue measure, i.e., any reasonable 
initial distribution leads asymptotically to the microcanonical distribu- 
tion in the weak sense. The latter is a consequence of the fact that the 
baker’s map is mixing (and hence ergodic). The mixing property can be
shown to be a corollary to its equivalence with a Bernoulli shift. Periodic 
orbits are dense in the unit square, corresponding to recurrent bi-infinite 


sequences.

Given any point $(x, y)$ within the unit square, the stable and unstable
directions under the forward mapping are seen to be parallel to the y-
and the x-axis respectively (the two get interchanged under the reverse
mapping), resulting in mutually perpendicular stable and unstable manifolds,
corresponding to which the Lyapunov exponents are found to be
$\lambda_\pm = \pm\ln 2$, the sum of the two being zero owing to the
area-preserving property of the map. The topological entropy of the baker’s
map is $\ln 2$, as is its KS entropy, the latter conforming to the Pesin
identity (formula (9-63b)). The KS entropy is obtained by considering a
partition of the unit square in two halves by the line $x = \frac{1}{2}$ and
following the prescription mentioned in sec. 9.5.5.7. This turns out to be a
generating Markov partition (refer to sec. 9.5.5.4) for the map.


The approach to equilibrium as a function of the forward iteration number
($n$) can be worked out by looking at the Ruelle resonances for the
map [65]. The Frobenius-Perron operator ($F$) for a single iteration of the
map is unitary and, for an initial probability distribution $\rho_0$, gives
the $n$th iterate $\rho_n$ as $\rho_n = F^n \rho_0$. Referred to the space of
square-integrable functions on the phase space, $F$ has an infinitely
degenerate continuous spectrum on the unit circle which, however, does not
provide specific information on the mode of approach to equilibrium. When
considered in an enlarged space of distributions, the spectrum of $F$ has a
part lying off the unit circle. Defining formally a self-adjoint operator
$\mathcal{L}$ as $F = e^{-i\mathcal{L}}$, $\mathcal{L}$ is seen to possess
resonances in the lower half of the complex plane, in addition to a real
continuous spectrum, the resonances being located at $z_m = -im\ln 2$
($m = 1, 2, \cdots$), where $z_m$ is an $m$-fold degenerate eigenvalue of
$\mathcal{L}$ (we exclude the eigenvalue $z_0 = 0$, which corresponds to the
eigenvalue 1 of the operator $F$). The leading eigenvalue ($m = 1$) describes
the approach to equilibrium as $\rho_n \sim e^{-n\ln 2} \rho_0$. Generally
speaking, the Ruelle resonances are related to the Lyapunov exponents, as is
evident in the present instance of the baker’s map ($z_m = -im\lambda_+$).
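
The decay rate set by the leading resonance can be made visible in a simple
correlation function. The sketch below is an illustration under stated
assumptions rather than a computation of the resonances themselves: the
observable is chosen (arbitrarily) as $A(x, y) = x$, for which the forward map
gives $x_n = 2^n x_0 \pmod 1$ independently of $y$, and the equilibrium average
is taken over the uniform measure by a midpoint rule. The function name
`sawtooth_autocorrelation` and the grid size are hypothetical choices. A direct
evaluation of the integral gives $C(n) = 2^{-n}/12$, i.e., a decay as
$e^{-n\ln 2}$.

```python
def sawtooth_autocorrelation(n, cells=2 ** 14):
    """Midpoint-rule estimate of C(n) = <x_n x_0> - <x>^2 for the baker's map.

    The x co-ordinate evolves as x_n = 2^n x_0 (mod 1) and the equilibrium
    measure is uniform, so that <x> = <x_n> = 1/2.
    """
    h = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x0 = (k + 0.5) * h           # midpoint of the k-th cell
        xn = (2 ** n * x0) % 1.0     # x co-ordinate after n iterations
        total += x0 * xn
    return total * h - 0.25

correlations = [sawtooth_autocorrelation(n) for n in range(9)]
for n, c in enumerate(correlations):
    # The last column compares the estimate with 2^(-n) / 12.
    print(n, c, c * 12 * 2 ** n)
```

Successive values of the estimate fall by a factor close to 2 per iteration,
in keeping with the location of the leading resonance at $z_1 = -i\ln 2$.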


The approach to equilibrium takes place by stretching and folding, typical 
of mixing systems. An initial distribution ($\rho_0$) gets stretched along the x-
axis and compressed along the y-axis under forward iterations and, after
a large number of iterations, tends to a distribution that is smooth along 
the unstable manifold and an irregular one on a very fine scale along 
the stable manifold, i.e., an initial set of a large number of points gets 
distributed over a large number of lines parallel to the x-axis, spaced 


densely along the y-axis. 


Figure 9-5: Depicting schematically the result of a number of iterations of the 
baker’s map on an initial probability distribution concentrated on a small region 
of the unit square (not shown); regardless of the initial distribution (assuming 
that it is not an exceptional one), one gets a distribution concentrated on thin 
strips parallel to the x-axis, the direction of the unstable manifold; on further 
forward iterations of the map, the strips get stretched and thinned, and folded 
back into the unit square, resulting in a finely spaced irregular distribution 
along the direction of the stable manifold; as the number of iterations goes to 
infinity, the distribution approaches a uniform one in the weak sense.

The baker’s map constitutes a simple model of Hamiltonian chaos.


9.5.8.2 The Arnold map 


The Arnold map (commonly referred to as the ‘Arnold cat map’, or the
‘cat map’ in brief) is an instance of a class of maps termed toral
automorphisms. A toral automorphism is a transformation on the 2-torus, any
point on which is specified by a point $(x, y)$ within the unit square with
opposite faces identified (one can also use two angles as co-ordinates,
each in the range 0 to $2\pi$, where $2\pi$ is identified with zero). The map
is of the general form

$$(x, y) \to (x', y') = ([ax + by], [cx + dy]) \quad (a, b, c, d \text{ integers}), \tag{9-79a}$$

where the symbol $[\cdot]$ means ‘modulo 1’. A commonly used form of the cat
map corresponds to $a = 2$, $b = 1$, $c = 1$, $d = 1$, which can be
represented as

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \pmod 1. \tag{9-79b}$$


The cat map can be visualized as a transformation of the unit square in
a number of elementary steps. The square is sheared one unit up in the
x-y plane, then two units to the right, and finally all its parts that have
moved out of the unit square are moved back by the ‘modulo 1’ operation.


Like the baker’s map, the cat map is area preserving and invertible, and
is chaotic. It is an Anosov diffeomorphism, and is mixing (and hence is
ergodic), the natural measure being uniform on the torus. The Lyapunov
indices are the logarithms of the eigenvalues of the 2 × 2 matrix occurring
in (9-79b) and are given by

$$\lambda_\pm = \ln \frac{3 \pm \sqrt{5}}{2}. \tag{9-80}$$

Periodic points of the map are those with rational fractions as their x-
and y- co-ordinates, and cover the torus densely. The map is
transitive (there exist dense trajectories). In virtue of being an Anosov
map, it is structurally stable. The topological entropy and the KS entropy
of the cat map are both given by

$$h_{\rm top} = h_{\rm KS} = \lambda_+ = \ln \frac{3 + \sqrt{5}}{2}. \tag{9-81}$$
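
The statements on the Lyapunov indices and on the periodic points can be
verified with a few lines of arithmetic. The following is a minimal sketch
under stated assumptions: the names `cat_map` and `period`, the sample point
$(\frac{1}{5}, \frac{2}{5})$ and the cap `max_steps` are illustrative choices,
and exact rational arithmetic is used so that the return of a rational point
to itself can be tested without rounding errors.

```python
import math
from fractions import Fraction

# Entries of the 2 x 2 matrix of the cat map in (9-79b).
a, b, c, d = 2, 1, 1, 1

trace, det = a + d, a * d - b * c
root = math.sqrt(trace * trace - 4 * det)
lyap_plus = math.log((trace + root) / 2)    # ln of the larger eigenvalue
lyap_minus = math.log((trace - root) / 2)   # ln of the smaller eigenvalue

def cat_map(x, y):
    """One iteration of the cat map on the unit square with periodic boundaries."""
    return (a * x + b * y) % 1, (c * x + d * y) % 1

def period(x, y, max_steps=10_000):
    """Number of iterations after which the point (x, y) first returns to itself."""
    u, v = cat_map(x, y)
    steps = 1
    while (u, v) != (x, y):
        if steps >= max_steps:
            raise RuntimeError("no return found within max_steps")
        u, v = cat_map(u, v)
        steps += 1
    return steps

print(det)                                  # unit determinant: area preservation
print(lyap_plus, lyap_minus, lyap_plus + lyap_minus)
print(period(Fraction(1, 5), Fraction(2, 5)))
```

The two indices add up to zero because the determinant of the matrix is unity,
and the sample point with rational co-ordinates returns to itself after two
iterations, $(\frac{1}{5}, \frac{2}{5}) \to (\frac{4}{5}, \frac{3}{5}) \to
(\frac{1}{5}, \frac{2}{5})$.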


The stable and unstable directions at any point in the unit square in the
x-y plane (in which the map acts as a linear transformation with periodic
boundary conditions) are mutually orthogonal straight lines (parallel to
the eigenvectors of the 2 × 2 matrix in (9-79b)) whose angles of inclination
to the x- and y-axes are irrational multiples of $2\pi$. An initial
probability distribution in the unit square (where a set of exceptional
distributions is excluded) gets compressed along the stable direction and
stretched along the unstable direction. The periodic boundary conditions,
along with the geometry of the stable and unstable manifolds, ensure that,
after a large number of iterations of the map, the initial distribution is
smoothly stretched along the unstable lines and has a finely spaced irregular
distribution along the stable direction, as in the case of the baker’s map.
Once again, as the number of iterations $n$ goes to $\infty$, the initial
distribution approaches the uniform equilibrium distribution in a weak sense.


The cat map is exceptional in the sense that the approach to equilibrium
is faster than exponential in virtue of hidden symmetries of number
theoretic origin. However, maps on the torus constructed out of the cat
map by adding small perturbations to it are more generic in this respect,
and their Ruelle resonances can be located by invoking various
approximation schemes [76]. The leading resonance (analogous to $z_1$ in the
case of the baker’s map, see sec. 9.5.8.1) is then found to imply an
exponential approach to equilibrium.


Fig. 9-6 shows a partitioning of the torus (equivalent to the unit square 
with periodic boundary conditions) into a number of regions by means 
of the stable and unstable manifolds issuing from the origin (the stable 
manifold is shown as issuing from the point (1,0) since it is inclined at an 
obtuse angle with the x-axis). This constitutes a generating Markov parti- 
tion of the torus and can be used to work out the KS entropy. Generating 
Markov partitions are useful in the construction of the SRB measures 


defined on Axiom-A attractors (sec. 9.10.3). 


9.5.8.3 The Smale horseshoe 


The Smale horseshoe is a measure-preserving transformation of the unit
square (other geometries of the phase space are also possible) where,
however, some regions in the square are mapped outside while some others
remain within the square ([29], [106], [62]), i.e., points escape under
forward and backward iterations, as if repelled by points that remain
inside. In the following, we do not specify the map in concrete terms, but
indicate its action on the unit square in giving rise to a repeller.

Figure 9-6: Depicting a generating Markov partition (schematic, not to scale)
for the Arnold cat map (eq. (9-79b)) defined on the unit square with periodic
boundary conditions (the 2-torus); boundaries of the various regions formed
by parts of the stable and unstable manifolds (refer back to sec. 9.5.5.4) are
marked ‘s’ and ‘u’ respectively; the partition is made up of four regions marked
with numerals; each of these regions is a connected one on the torus, though
it appears unconnected on the unit square (connection is established by the
periodic boundary conditions); finer Markov partitions can be generated by the
forward and backward actions of the mapping on the partition shown.


Imagine the unit square being stretched along the x-axis and compressed
along the y-axis, both by a factor $\eta > 2$ (we assume $\eta$ to be
sufficiently large so as to obtain fig. 9-7), so as to be transformed into a
thin rectangle that is then bent in the form of a ‘U’ and placed on the square
as in fig. 9-7(A), giving rise to two strips whose points lie both in the
square and in the bent rectangle. This constitutes a forward application of
the map. A backward application consists of a similar stretching and
compression, each by a factor $\eta$, but now the stretching is along the
y-axis and the compression is along the x-axis so that on bending and placing
on the square (fig. 9-7(B)), one now obtains four small rectangles that remain
in common within the original square, deriving from the two strips obtained
under forward iteration, and the pair of strips resulting under the
backward iteration. The points within these four small rectangles, each
of which is reduced in area by a factor $\eta^2$ compared to the original
square, are the ones that remain within the original square after one forward
and one backward iteration, the remaining points being ‘repelled’ so as to
‘escape’ from the square.


Under a second application of the forward and backward iterations, each
of the four small rectangles that survive within the unit square in the
earlier stage yields four new rectangles, each reduced once again by a
factor $\eta^2$ compared to its predecessor, making a total of $4^2 = 16$
small rectangles in all that survive the second stage. After a large number of
repeated applications of such stages of forward and backward iterations,
one obtains a surviving set that resembles a direct product of two Cantor
sets (see sec. 9.5.9).


The Smale horseshoe is of central relevance in the theory of dynamical 
chaos. If there occurs a transversal intersection of a stable manifold 
and an unstable manifold in the phase space of a flow, there results a 
structure in the phase space that is equivalent to the one described above 
(generated by an infinite number of forward and backward iterations of 
the Smale horseshoe) and is referred to as a homoclinic or a heteroclinic 
tangle. This is the way islands of chaos are produced in the phase space 
within families of regular trajectories produced in the flow. However, the 
chaotic motions within the tangles will not concern us in the present 
context. The tangles acquire relevance in the context of Arnold diffusion. 


The likely role of the latter for systems made up of a large number of
particles will be briefly mentioned in the concluding section of this book
(sec. 10.5).

Figure 9-7: Depicting schematically a forward and a backward application of
the Smale horseshoe on the unit square; (A) under a forward mapping, the unit
square is elongated along the x-axis by a factor $\eta$ ($> 2$) and
compressed along the y-axis by the same factor; the resulting rectangle is
bent into a ‘U’ and placed on the square so as to produce two thin strips
where it overlaps with the latter; (B) the backward mapping involves a
stretching along the y-direction and a compression along the x-direction, each
by a factor of $\eta$ once again; when the bent rectangle is placed on the
figure obtained in (A), there results an overlap region made up of four small
squares, each of area $1/\eta^2$; points in these squares survive a single
stage of forward and backward application of the map; after an infinite number
of such stages, the surviving points constitute the repeller, which is a
fractal (refer to sec. 9.5.9) set equivalent to the direct product of two
Cantor sets, while the remaining points escape from the unit square.


Fractal repellers of the type outlined above assume relevance in the
escape rate formalism, to be briefly outlined in sec. 9.6 below, where one
establishes useful relations between transport properties of an open
system and its dynamical features expressed in terms of the KS entropy,
Lyapunov exponents, and the escape rate. In the case of the repeller
described in the present section, generated by the action of the Smale
horseshoe on the unit square, the last three are related to each other as
(recall formula (9-69))

$$\gamma = \ln\eta - \ln 2. \tag{9-82}$$


The dynamics on the repeller is chaotic, since it can be shown to be 


equivalent to a Bernoulli shift as in the case of the baker’s map. 
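
The relation (9-82) can be made plausible by a short counting argument. The
sketch below assumes (consistently with the construction described above, but
not spelled out there) that the initial points are distributed uniformly over
the unit square, that the escape rate $\gamma$ is defined through the fraction
$P_n$ of points still inside the square after $n$ forward iterations, and that
the points surviving one forward iteration occupy two strips of width
$1/\eta$ each:

```latex
\begin{align*}
P_1 &= 2 \times \frac{1}{\eta} = \frac{2}{\eta}, \\
P_n &= \left(\frac{2}{\eta}\right)^{n} = e^{-\gamma n}, \\
\gamma &= \ln\eta - \ln 2 .
\end{align*}
```

With the stretching factor $\eta$ giving a positive Lyapunov exponent
$\ln\eta$, and with the value $\ln 2$ being the entropy of a Bernoulli shift on
two equally weighted symbols, the result has the form of a difference between
a Lyapunov exponent and a KS entropy.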


9.5.8.4 Projected dynamics and chaos: a toy model for the Boltzmann equation


As mentioned in sec. 8.3.4 the Boltzmann equation can be looked upon 
as representing a ‘reduced’ or ‘projected’ dynamics resulting from the 
full phase space dynamics described by the Liouville equation, where 
the projection is conveniently done with reference to the BBGKY hierar- 
chy. Such instances of projected or ‘contracted’ description of a complex 
system abound in the entire subject of statistical mechanics where one 
extracts relevant information in a given context while ignoring, in some
appropriate sense, a mass of ‘irrelevant’ information. The success of the
reduced model depends on the proper choice of ‘relevant’ as distinguished 
from ‘irrelevant’ information in the given context, and on the effectiveness 


of the method adopted in bringing about the projection. 


In the present section we introduce a simple and interesting model ([29],
chapter 7) based on the baker’s map, where the projection is easily
described and visualized, thereby providing insight as to how a contracted
description can lead to a Boltzmann-like evolution towards an equilibrium
distribution. In this example, the baker’s map itself may be looked
upon as a simple low-dimensional model mimicking a real-life macroscopic
system, or as a contracted description of a high-dimensional system with only
the ‘relevant’ description extracted from the latter, with a further
contraction being applied, aimed at illustrating how a contraction brings
about an evolution toward an equilibrium state.


Referring to the baker’s transformation and its inverse (equations (9-77a),
(9-77b)) in the unit square (the ‘phase space’ in the present context),
the evolution of a probability distribution $\rho(x, y)$ in the unit square is
described in discrete time ($n - 1 \to n$) as (refer back to)

$$\rho_{n-1} \to \rho_n: \quad \rho_n(x, y) = \rho_{n-1}(\Phi^{-1}(x, y)), \tag{9-83a}$$

i.e.,

$$\rho_n(x, y) = \begin{cases} \rho_{n-1}\left(\frac{x}{2}, 2y\right) & (\text{if } y < \frac{1}{2}), \\ \rho_{n-1}\left(\frac{x + 1}{2}, 2y - 1\right) & (\text{if } y > \frac{1}{2}). \end{cases} \tag{9-83b}$$


We now consider a projection operation in the space of probability
distributions where the dependence of $\rho(x, y)$ on $y$ is integrated out in
order to arrive at the reduced distribution function

$$W_n(x) = \int_0^1 dy\, \rho_n(x, y) = \int_0^{1/2} dy\, \rho_{n-1}\left(\frac{x}{2}, 2y\right) + \int_{1/2}^{1} dy\, \rho_{n-1}\left(\frac{x + 1}{2}, 2y - 1\right). \tag{9-84a}$$

On changing the integration variable as $y \to 2y$ in the first integral on
the right hand side and $y \to 2y - 1$ in the second integral, one obtains

$$W_n(x) = \frac{1}{2} \int_0^1 dy \left[\rho_{n-1}\left(\frac{x}{2}, y\right) + \rho_{n-1}\left(\frac{x + 1}{2}, y\right)\right] = \frac{1}{2} \left[W_{n-1}\left(\frac{x}{2}\right) + W_{n-1}\left(\frac{x + 1}{2}\right)\right]. \tag{9-84b}$$


This is the model ‘Boltzmann equation’ for the reduced distribution
function $W_n(x)$ obtained by projection from the ‘Liouville equation’
describing the evolution of the full probability distribution
$\rho_n(x, y)$.


Recall how the single-particle distribution function was obtained by
integrating out the redundant degrees of freedom from the full time dependent
probability distribution in the phase space; the Boltzmann equation
describes the time evolution of the reduced single-particle distribution
function (refer back to sec. 8.3.2).


It is easily seen that the equilibrium solution for this reduced distribution
function is $W_{\rm eq}(x) = 1$. The question then arises as to whether an
initial reduced distribution differing from $W_{\rm eq}$ approaches the latter
with the passage of discrete time ($n$) and whether this approach can be
characterized in terms of a monotonically varying entropy-like function,
analogous to Boltzmann’s H-function (refer back to sec. 8.3.5). This, indeed,
is trivially true. Thus, defining

$$H_n = \int_0^1 dx\, W_n(x) \ln W_n(x), \tag{9-85a}$$

and making use of the evolution equation (formula (9-84b),
$W_{n-1} \to W_n$) of the reduced probability distribution $W_n(x)$, one can
check that

$$H_n \le H_{n-1}, \tag{9-85b}$$

where the equality sign holds only when the reduced distribution coincides
with $W_{\rm eq}$.

In order to arrive at (9-85b), one has to make use of the convexity of the
function $F(x) = x\ln x$, which implies
$\frac{1}{2}(a\ln a + b\ln b) \ge \frac{a + b}{2}\ln\frac{a + b}{2}$
($a, b > 0$).


The time evolution of an initial probability distribution $W(x)$ along the
x-axis can be explicitly worked out in this simple example of a projection,
when one can see that the initial distribution does indeed go to the
limiting distribution $W_{\rm eq}$ as the discrete time $n$ is made to go to
$\infty$, the rate of decay to $W_{\rm eq}$ being exponential
($\sim 2^{-n}$). It is of interest to note that the time scale in which the
projected distribution approaches equilibrium ($W_{\rm eq}$) is distinctly
short compared to the time required for the full distribution
($\rho(x, y)$) to approach its own equilibrium, corresponding
to a uniform distribution in the unit square. This feature distinguishing
projected distributions from the full phase space distribution is seen
to persist in other simple instances of projection such as the Arnold cat
map, and seems likely to persist for more complex systems characterized
by the Anosov property ([29], chapter 17).
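
The monotonic decrease of the H-function and the $2^{-n}$ approach to the
uniform distribution can be followed numerically by iterating the reduced
evolution equation (9-84b). The following is a minimal sketch under stated
assumptions: the initial distribution $W_0(x) = 2x$ is an arbitrary
illustrative choice, the function names (`reduced_step`, `h_function`,
`max_deviation`) are hypothetical, and the integral (9-85a) is estimated by a
midpoint rule on a uniform grid.

```python
import math

def reduced_step(w):
    """One step of (9-84b): W_n(x) = [W_{n-1}(x/2) + W_{n-1}((x+1)/2)] / 2."""
    return lambda x: 0.5 * (w(x / 2.0) + w((x + 1.0) / 2.0))

def h_function(w, cells=1000):
    """Midpoint-rule estimate of H = integral over [0, 1] of W ln W, as in (9-85a)."""
    total = 0.0
    for k in range(cells):
        value = w((k + 0.5) / cells)
        if value > 0.0:                      # W ln W -> 0 as W -> 0
            total += value * math.log(value)
    return total / cells

def max_deviation(w, cells=200):
    """Largest deviation of W from the equilibrium value 1 on the grid."""
    return max(abs(w((k + 0.5) / cells) - 1.0) for k in range(cells))

def initial_distribution(x):
    """Illustrative normalized initial distribution W_0(x) = 2x (an assumption)."""
    return 2.0 * x

w = initial_distribution
h_values, deviations = [], []
for n in range(7):
    h_values.append(h_function(w))
    deviations.append(max_deviation(w))
    w = reduced_step(w)

for n, (h, dev) in enumerate(zip(h_values, deviations)):
    print(n, h, dev)
```

For this particular initial distribution the iteration can also be done by
hand, giving $W_n(x) = 1 + 2^{-n}(2x - 1)$, so that the deviation from
$W_{\rm eq}$ is halved at every step while $H_n$ decreases towards zero.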


In the case of the projection from the baker’s map along the x-axis, one
notes that the latter is the unstable direction for the map. The question
arises as to what may happen if the projection is taken along some other
direction. The case of the Arnold map is relevant since, in this map, the
stable and unstable directions are inclined at an angle to the Cartesian axes
with reference to which the map is defined. It turns out that the approach
to equilibrium does not depend on the direction along which the projection
is taken so long as it does not coincide with the stable direction for the map
(the y-axis in the case of the baker’s transformation). For any other
direction of projection, the stretching component along the unstable direction
becomes of overriding relevance.


In summary, the baker’s map and its projection illustrate in a simple and 
solvable setting a number of distinctive features of the Boltzmann equa- 
tion, obtained from the Liouville equation by a process of projection. One 
point of difference between the two relates to the fact that no assumption 
resembling the one of molecular chaos seems to have been necessary in 
the baker’s case. This is due to the fact that the baker’s map is itself 
chaotic to start with, whereas no such property is assumed to character- 
ize the phase space distribution of a gas from which the single-particle 
distribution function is obtained. Even though it is likely that the Liou- 
ville dynamics is of the mixing type in the case of an interacting system 
made up of a large number of particles, it is not explicitly built into the 
full phase space distribution from which the single-particle distribution 


function is obtained by projection. 


As for the time scales involved in the process of approach to equilibrium,
the toy model of the present section points at two distinct regimes: the
one pertaining to the full probability distribution and the smaller time
interval characterizing the projected distribution. In this context, it is
worthwhile to refer to a physically intuitive classification, according to 
Bogoliubov, of phase space variables as ‘relevant’ and ‘irrelevant’ ones 
([29], chapter 7; the relevance of a similar classification has been pointed 
out by other authors too), and to the analogous distinction between time 
scales of approach to equilibrium mentioned in sec. 9.2.4. It seems to be 
generally true that the ‘relevant’ phase space variables (including those 
describing the thermodynamic states of a system) are ones depending 
on a large number of microscopic phase space variables (i.e., ones de- 
pending on the position and momentum variables of a large number of 
particles) while the ‘irrelevant’ ones depend on only a small number of 
those. With reference to this scheme, it may seem paradoxical that the
single-particle distribution function $f(\mathbf{r}, \mathbf{p}, t)$ satisfying the Boltzmann
equation varies over a time scale relevant to the time evolution of the
hydrodynamic modes characterizing a gas, which is actually the slowest
time scale in the approach to equilibrium. The paradox is resolved when
one observes that $f$ is not determined by the variables characterizing any
specified particle, but refers to all the particles (identical to one another)
severally, which is why the time scale of variation of $f$ is actually the
longest one in the approach to equilibrium of the gas as a whole.


9.5.9 Dynamical chaos and fractals 


The term ‘chaos’ has different connotations in various different contexts. 
In this book, we loosely interpret the term as a feature of dynamical systems
involving ‘sensitive dependence on initial conditions’ in a compact
phase space (or an invariant set within it), in which typical trajectories
are constrained to develop folds. The combined effect of stretching and
folding results in a number of features that can be looked upon as char- 
acteristics of chaos. In particular, chaos is of frequent occurrence in 
mixing systems, while an ergodic system need not be chaotic. Axiom- 
A systems are ones that are defined in terms of a number of precisely 
formulated conditions that imply chaos and, in addition, numerous results
of relevance in statistical mechanics. Generally speaking, chaotic
systems are characterized by positive values of the Kolmogorov-Sinai (KS)
entropy; at times this is taken to be the defining feature of chaos.


1. While the mixing property implies a decay of correlations, it leaves 
open the question of the nature of the decay. On the other hand, 
Axiom-A systems generally involve an exponential decay of correla- 
tions, though it is not established that an exponential decay of cor- 
relations necessarily requires Axiom-A properties. Exponential decay 
is commonly observed in systems of interest in statistical mechanics. 


For instance, hydrodynamic modes exhibit an exponential decay. 


2. One can speak in terms of a hierarchy, referred to as the ergodic
hierarchy [43], that can be schematically depicted as follows:


ergodic ⊃ weak mixing ⊃ strong mixing ⊃ K-system ⊃ Bernoulli system,


in which a category occurring to the right of another constitutes a
subset of the latter. For instance, weak mixing implies ergodicity but
the latter does not imply the former. Referring to the above hierarchy,
chaos is often associated with K-systems since the latter are necessarily
characterized by a positive KS entropy.


Chaotic systems often involve the presence of fractals. A fractal is a set
having a fractional dimension within a space with specified metric properties.
We will assume the latter to be a Euclidean space or a manifold
made up of small patches, each of which is part of a Euclidean space.


The dimension of a set can be defined in various ways, each corresponding
to a distinct concept of dimension as compared with others. Of particular
interest in the present context are the box-counting dimension ($D_0$),
the Hausdorff dimension ($D_H$), and the information dimension. One can
also define a generalized dimension $D_q$ depending on a parameter $q$, of
which $D_0$ is a special case, corresponding to $q = 0$ [106], while the
information dimension corresponds to $q = 1$.


The box-counting dimension of a set $S$ is defined in terms of the number
$N(\epsilon)$ of hypercubes, each having a side of length $\epsilon$, necessary to cover
the given set $S$ (i.e., the one whose dimension is to be defined). The cubes
are assumed to have a dimension $D$ equal to that of the Euclidean space
within which the set $S$ resides (for instance, $D = 2$ in the case of a set $S$ in
the Euclidean plane, i.e., the hypercubes are small squares in the plane;
a value $D > 2$ is also admissible). One then defines

$$D_0 = \lim_{\epsilon \to 0} \frac{\ln N(\epsilon)}{\ln(1/\epsilon)}. \tag{9-86}$$


In defining the Hausdorff dimension, on the other hand, one first defines
a $D$-dimensional Hausdorff measure of the set $S$ for a variable parameter
$D$, where $D$ may be any positive real number. We define the diameter of
a set as the largest separation between a pair of points in it, among all
possible pairs. Let us consider a finite or countably infinite covering of
the set $S$ by means of small sets $s_i$ ($i = 1, 2, \ldots$), each having a diameter
$\epsilon_i \le \delta$ for some positive $\delta$, and define

$$m^{[D]}(\delta) = \inf_{\{s_i\}} \sum_i \epsilon_i^{D}, \tag{9-87a}$$

where the right hand side stands for the infimum (greatest lower bound) of
the sum indicated, evaluated over all coverings $\{s_i\}$ of the type specified.
The $D$-dimensional Hausdorff measure is then defined as

$$m^{[D]} = \lim_{\delta \to 0} m^{[D]}(\delta). \tag{9-87b}$$


It turns out that there exists a $D_H \ge 0$ such that, looked at as a function of
the parameter $D$, the measure $m^{[D]}$ has a sharp transition from $m^{[D]} = \infty$
for $D < D_H$ to $m^{[D]} = 0$ for $D > D_H$. This transition value $D_H$, across which
the $D$-dimensional Hausdorff measure drops from $\infty$ to 0, then defines the
Hausdorff dimension of $S$, related to the box-counting dimension as

$$D_0 \ge D_H. \tag{9-87c}$$


While $D_0$ and $D_H$ are geometrical features of the given set $S$, the generalized
dimension $D_q$, defined for the variable parameter $q$, is a dynamical
feature related to the natural measure ($\mu$) defining a dynamical system
such that $S$ lies in the phase space of that system. Let us consider a cover
of the set $S$ with cubes of side $\epsilon$ each, there being $N(\epsilon)$ such cubes in
the covering. If the natural measure of the $i$th cube ($i = 1, 2, \ldots, N(\epsilon)$) is
$\mu_i$, then $D_q$ is defined as

$$D_q = \frac{1}{1-q} \lim_{\epsilon \to 0} \frac{\ln \sum_{i=1}^{N(\epsilon)} \mu_i^{q}}{\ln(1/\epsilon)}. \tag{9-88}$$


The idea underlying this definition is that, for $q > 0$, boxes with a larger
natural measure receive greater weight in determining the dimension.
Among the generalized dimensions $D_q$, the one corresponding to $q = 1$
(understood as the limit $q \to 1$) is referred to as the information
dimension, which satisfies the bound $D_1 \le D_0$.


The generalized dimension $D_q$ can also be defined with reference to any
measure other than the natural measure of a dynamical system.
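
As a hedged illustration of how the generalized dimension behaves, the sketch below evaluates $D_q$ for a measure that is not taken from the text: the binomial measure on the unit interval, built by repeatedly splitting every dyadic interval into two halves carrying fractions $w$ and $1 - w$ of its measure. The weight `w`, the depth `n`, and all function names are assumptions made for the example. At depth $n$ there are $2^n$ boxes of side $2^{-n}$, and for this particular measure the finite-$n$ ratio already equals its limiting value.

```python
import math

def box_measures(w, n):
    """Measures of the 2**n dyadic boxes of side 2**-n for the binomial measure."""
    measures = [1.0]
    for _ in range(n):
        # each box passes fractions w and 1 - w of its measure to its two halves
        measures = [m * f for m in measures for f in (w, 1.0 - w)]
    return measures

def generalized_dimension(w, q, n=12):
    """Estimate of D_q from the box measures at the single scale eps = 2**-n."""
    mu = box_measures(w, n)
    log_inv_eps = n * math.log(2.0)
    if abs(q - 1.0) < 1e-12:
        # q -> 1 limit (information dimension): -sum(mu ln mu) / ln(1/eps)
        return -sum(m * math.log(m) for m in mu if m > 0.0) / log_inv_eps
    return math.log(sum(m ** q for m in mu)) / ((1.0 - q) * log_inv_eps)

w = 0.3  # hypothetical weight, chosen only for illustration
d0 = generalized_dimension(w, 0.0)
d1 = generalized_dimension(w, 1.0)
d2 = generalized_dimension(w, 2.0)
print(d0, d1, d2)
```

For an unequal weight the values decrease with $q$, in line with the bound on the information dimension quoted above; for $w = 1/2$ the measure is uniform and all the $D_q$ coincide.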


Fractal sets are found to occur in the theory of dynamical chaos as attrac- 
tors and repellers of dynamical systems of various descriptions. For in- 
stance, typical repellers representing steady states of open systems with 
absorbing boundaries are fractal sets analogous to the one produced by 
the Smale horseshoe. Commonly occurring Axiom-A attractors repre- 
senting steady states of thermostated systems (refer to sec. 9.9), detected 
in numerical experiments, are also found to be fractal sets in the phase 
space. For most of these fractals, one finds that the box-counting dimension
and the Hausdorff dimension are equal ($D_H = D_0$). However, sets
with $D_0 > D_H$ are not difficult to construct.


Cantor sets are often found to feature in fractals of various descriptions.
A Cantor set in an interval on the real line is a non-denumerable set of
measure zero, and of non-zero Hausdorff dimension. For instance, the
middle-thirds Cantor set is produced, first by scooping out the middle one-third
part from the interval, then scooping out the middle one-third from
each of the two surviving parts, and repeatedly following it by the same
process of scooping out the middle one-third part of each of the intervals
that survive the previous stage. In the limit of an infinite number of
applications of this process, the remaining set contains no interval (an
interval being a set of non-zero measure), but is an uncountable one.
The box-counting dimension and the Hausdorff dimension of the middle-thirds
Cantor set are equal, given by ([29], chapter 11)

$$D_0 = D_H = \frac{\ln 2}{\ln 3}. \tag{9-89}$$
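
The value $\ln 2/\ln 3$ can be checked with a short sketch, offered here only as an illustration. It builds the $2^n$ intervals that survive $n$ scooping steps using exact integer arithmetic, counts them as the boxes of side $3^{-n}$ needed to cover the set, and evaluates the sum of (diameter)$^D$ over this particular covering. That sum is only an upper bound on the infimum entering the Hausdorff construction, not the infimum itself. The function names are assumptions made for the example.

```python
import math

def cantor_left_endpoints(n):
    """Left endpoints, in units of 3**-n, of the 2**n intervals left after n steps."""
    endpoints = [0]
    for _ in range(n):
        # an interval starting at a keeps its left third (3a) and right third (3a + 2)
        endpoints = [3 * a for a in endpoints] + [3 * a + 2 for a in endpoints]
    return sorted(endpoints)

def covering_sum(n, D):
    """Sum of diameter**D over the level-n intervals (an upper bound only)."""
    return len(cantor_left_endpoints(n)) * (3.0 ** (-n)) ** D

d_exact = math.log(2) / math.log(3)
for n in (4, 8, 12):
    count = len(cantor_left_endpoints(n))       # N(eps) for eps = 3**-n
    estimate = math.log(count) / (n * math.log(3))
    print(n, count, estimate,
          covering_sum(n, d_exact - 0.1),        # grows with n
          covering_sum(n, d_exact),              # stays close to 1
          covering_sum(n, d_exact + 0.1))        # shrinks with n
```

The covering sum grows without bound for $D$ slightly below $\ln 2/\ln 3$ and tends to zero for $D$ slightly above it, which mirrors the sharp transition used to define the Hausdorff dimension.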


In the case of the repeller generated by the Smale horseshoe as outlined
in sec. 9.5.8.3, which is a direct product of two one-dimensional Cantor
sets (we assume that $\eta \gg 2$), one obtains, likewise,

$$D_0 = D_H = \frac{2 \ln 2}{\ln \eta}. \tag{9-90}$$


Fractal sets make their appearance in various different contexts in non- 
equilibrium statistical mechanics, especially in the description of phase 
space structures that appear in non-equilibrium steady states, includ- 
ing the associated invariant measures. They are found to be involved in 
the structures of generalized eigenfunctions of the Frobenius-Perron op- 
erators associated with the Ruelle-Pollicott resonances. These also make 
their appearance in transport theory of dissipative systems where the 
transport coefficients themselves are found to have a fractal structure
when looked upon as functions of physical parameters characterizing the
transport [29]. Fractal structures have been found to characterize the
eigenfunctions of the Frobenius-Perron operator describing the hydrodynamic
modes of systems (briefly outlined in sec. 9.7 below).


In other words, the spectral theory over the complex domain provides a 
direct link between processes such as diffusion in an infinite medium 
and the microscopic Liouville dynamics in the phase space. Associated 
with the Ruelle-Pollicott resonances in the complex plane, there occur 
generalized eigenvalues and eigenfunctions of the Frobenius-Perron operator,
where the latter appear as singular distributions rather than regular
functions in the phase space. These singular structures have been explicitly
constructed in simple models of systems extended in real space ([52],
chapter 7).


Fractal sets arise in the context of chaotic scattering and the associated 
scattering theory of transport, in the form of repellers made up of trapped 


trajectories, as briefly outlined in sec. 9.6 below. 


Non-equilibrium steady states of an infinitely extended system are also 


seen to possess fractal features, as outlined in sec. 9.8. 


9.6 Chaotic scattering and the escape rate formalism


Scattering systems are ones where trajectories escape to infinite distances.
In numerous situations of practical interest the scattered trajectories
turn out to be uncorrelated with the incoming ones in spite of relatively
simple deterministic laws governing the scattering process. This,
for instance, is the case for a particle scattered by a number of hard disks
in a plane. Thus, in a three-disk scatterer, an incoming trajectory is either
scattered away or trapped forever within the disks, thereby providing an
instance of chaotic scattering where the escape rate is related to the
dynamical entropy (specifically, the KS entropy) characterizing the chaos
and the Lyapunov exponents of the repeller on which the trapped trajectories
are confined. In the case of a large number of scatterers confined
within a large volume, the process of escape may be described in terms of
a diffusion equation in a region with absorbing boundaries, for which the
diffusion coefficient can be related to the escape rate, thereby providing
an instance where a transport coefficient is determined by features of the
microscopic chaotic dynamics.


The relation between the escape rate, dynamical entropy, and the Lyapunov
exponents of the repeller is obtained from (9-69) (formula (9-82)
constitutes a particular instance for $\eta \gg 2$). The derivation of this formula
can be illustrated by referring to a simple model of a 1D mapping [29]
on the unit interval described by two parameters ($p_0, p_1 > 0$),

$$x' = \Phi(x) = \begin{cases} \dfrac{x}{p_0} & \text{for } 0 \le x < \dfrac{1}{2}, \\[2ex] 1 + \dfrac{x - 1}{p_1} & \text{for } \dfrac{1}{2} \le x \le 1, \end{cases} \qquad \left[0 < p_0, p_1 < \tfrac{1}{2},\; p_0 + p_1 < 1\right], \tag{9-91a}$$


there being a discontinuity at $x = \frac{1}{2}$ as in the baker's map. For this simple
dynamical system, points in the interval $p_0 < x < 1 - p_1$ escape outside the
unit interval under the application of the map $\Phi$, while the remaining
points are trapped within the interval. Under repeated applications of
the mapping, parts of the remaining intervals are similarly scooped out
and, as the number of iterations ($n$) approaches $\infty$, the set of trapped
points approaches a Cantor set (being the middle-thirds Cantor set in the
special case $p_0 = p_1 = \frac{1}{3}$).


The total length of the surviving intervals after $n$ iterations is found to go
to zero exponentially for $n \to \infty$,

$$l_n = (p_0 + p_1)^n = e^{-\gamma n}, \qquad (\gamma = -\ln(p_0 + p_1)), \tag{9-91b}$$

i.e., $\gamma = -\ln(p_0 + p_1)$ is the escape rate for the problem.
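
The exponential shrinking of the surviving set can be checked with a small sketch, given here as an illustration under stated assumptions rather than as the construction used in [29]. It assumes a two-branch map with slope $1/p_0$ on the left half of the unit interval and slope $1/p_1$ on the right half, the right branch being taken as $1 + (x - 1)/p_1$ so that exactly the points with $p_0 < x < 1 - p_1$ escape in one step. The parameter values and function names are hypothetical.

```python
import math

def phi(x, p0, p1):
    """One step of the two-branch map (the form of the right branch is an assumption)."""
    if x < 0.5:
        return x / p0
    return 1.0 + (x - 1.0) / p1

def surviving_intervals(p0, p1, n):
    """Intervals of initial points that remain in [0, 1] for n iterations."""
    intervals = [(0.0, 1.0)]
    for _ in range(n):
        # preimages of the current intervals under the left and right branches
        left = [(p0 * a, p0 * b) for a, b in intervals]
        right = [(1.0 - p1 + p1 * a, 1.0 - p1 + p1 * b) for a, b in intervals]
        intervals = left + right
    return intervals

def stays_inside(x, p0, p1, n):
    """Iterate the map directly and report whether x survives n steps."""
    for _ in range(n):
        x = phi(x, p0, p1)
        if x < 0.0 or x > 1.0:
            return False
    return True

p0, p1, n = 0.3, 0.4, 8          # hypothetical parameter values
intervals = surviving_intervals(p0, p1, n)
total_length = sum(b - a for a, b in intervals)
gamma = -math.log(p0 + p1)       # escape rate
print(len(intervals), total_length, math.exp(-gamma * n))
```

The number of surviving intervals doubles at every step while their total length falls as $e^{-\gamma n}$; iterating the map directly on the midpoints of the intervals confirms that they do survive $n$ steps.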


The KS entropy and the Lyapunov exponents are defined in terms of the
natural measure $\mu$ on the repeller, which is a singular one, being the
limit (as $n \to \infty$) of a measure concentrated on the intervals surviving
$n$ iterations ($2^n$ in number). The total length of the surviving intervals
being $(p_0 + p_1)^n$, the measure on the $i$th surviving interval of length $l_i$
($i = 1, 2, \ldots, 2^n$) is $\mu_i = l_i/(p_0 + p_1)^n$. With reference to this natural
measure $\mu$, one obtains the Lyapunov exponent of the repeller (see [29],
chapter 11, for details),

$$\lambda = -\frac{1}{p_0 + p_1} \left[p_0 \ln p_0 + p_1 \ln p_1\right], \tag{9-92a}$$

while the KS entropy works out to (refer to (9-91b))

$$h_{KS} = \ln(p_0 + p_1) + \lambda = \lambda - \gamma, \tag{9-92b}$$

which is the required relation in the present context.


Each point on the unit interval is characterized by an escape time, i.e.,
the number of iterations after which it is expelled from the unit interval
(the average of all these escape times with respect to the natural measure
is $\gamma$), which turns out to be a highly singular function with a self-similar
structure.


The box-counting dimension ($D_0$) of the repeller can be worked out in a
straightforward way and is seen to be given by the transcendental equation

$$p_0^{D_0} + p_1^{D_0} = 1 \tag{9-93}$$

(note that this reduces to $D_0 = \frac{\ln 2}{\ln 3}$ in the special case $p_0 = p_1 = \frac{1}{3}$, as it
should, since the repeller is then the middle-thirds Cantor set).


All the above features relating to the chaotic dynamics of the system (viz.,
the escape rate, the Lyapunov exponent, and the KS entropy) can be obtained
from one single quantity, namely, the topological pressure, defined
in terms of the dynamical partition function (refer to formula (9-51b), which
is written in the case of a hyperbolic flow). Recall that the pressure $P_\beta$ was
obtained from the more general $P(A)$ for the choice (9-58) of the phase space
function $A$. In the present instance of the 1D map, the dynamical measure
$\mu_A$ (refer to (9-53)) for this special choice of $A$ reduces to the natural
measure $\mu$ referred to above for $\beta = 1$, and the dynamical partition function is
given by

$$Z_n(\beta) = \left(p_0^{\beta} + p_1^{\beta}\right)^n, \tag{9-94a}$$

from which the topological pressure is obtained as (refer to (9-52))

$$P_\beta = \lim_{n \to \infty} \frac{1}{n} \ln Z_n(\beta) = \ln\left(p_0^{\beta} + p_1^{\beta}\right). \tag{9-94b}$$

Once the topological pressure is known, all the characteristic quantities of
dynamical chaos can be obtained as in the case of equilibrium thermodynamics
where all thermodynamic parameters are related to the free energy
by means of appropriate derivatives:

$$\gamma = -P_{\beta=1}, \qquad \lambda = -\left[\frac{d}{d\beta} P_\beta\right]_{\beta=1}, \qquad h_{KS} = P_{\beta=1} - \left[\frac{d}{d\beta} P_\beta\right]_{\beta=1}, \tag{9-94c}$$

which is consistent with the second equality in (9-92b). This tells us that the
dynamical partition function or, equivalently, the topological pressure is the
feature of central relevance in dynamical chaos just as the free energy (or,
depending on the context, the Gibbs free energy, or the grand potential) is
of central relevance in equilibrium thermodynamics, a fundamental result
in the thermodynamic formalism for chaotic dynamical systems.
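
The way the escape rate, the Lyapunov exponent, the KS entropy and the box-counting dimension all follow from the pressure of the two-parameter 1D map can be sketched numerically. The snippet below is a minimal illustration, assuming the pressure $\ln(p_0^{\beta} + p_1^{\beta})$; the dimension is found as the root of $p_0^{D_0} + p_1^{D_0} = 1$, which is the same as the zero of the pressure, by bisection. Function names and the bisection tolerance are choices made for the example.

```python
import math

def pressure(beta, p0, p1):
    """Topological pressure P(beta) = ln(p0**beta + p1**beta) of the 1D map."""
    return math.log(p0 ** beta + p1 ** beta)

def pressure_derivative(beta, p0, p1):
    """Analytic derivative dP/dbeta."""
    z0, z1 = p0 ** beta, p1 ** beta
    return (z0 * math.log(p0) + z1 * math.log(p1)) / (z0 + z1)

def box_dimension(p0, p1, tol=1e-12):
    """Root of P(beta) = 0 on [0, 1] by bisection (P decreases with beta)."""
    lo, hi = 0.0, 1.0            # P(0) = ln 2 > 0 and P(1) = ln(p0 + p1) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure(mid, p0, p1) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p0 = p1 = 1.0 / 3.0              # middle-thirds case
gamma = -pressure(1.0, p0, p1)           # escape rate
lam = -pressure_derivative(1.0, p0, p1)  # Lyapunov exponent
h_ks = lam - gamma                       # KS entropy
d0 = box_dimension(p0, p1)               # box-counting dimension
print(gamma, lam, h_ks, d0)
```

In the middle-thirds case the four quantities should come out as $\ln(3/2)$, $\ln 3$, $\ln 2$ and $\ln 2/\ln 3$ respectively, which gives a quick consistency check on the relations quoted above.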


In order to relate the above considerations with a transport process, we
consider a situation in which a particle moves among a fixed set of scatterers
in a region bounded by two planes $x = 0$, $x = L$ while being unbounded
along the other directions, where the separation ($L$) between
the planes is assumed to be large. For the sake of concreteness, we assume
that the scatterers are hard spheres (or disks in the case of two-dimensional
motion). The planes at $x = 0$, $x = L$ are assumed to be absorbing
ones so that once the particle hits either of the planes, it does
not return, i.e., it escapes from the region (for the sake of consistency,
the escape probability has to go to zero for $L \to \infty$). This makes the problem
formally analogous to the case of the 1D mapping considered above,
where the motion among the fixed set of scatterers results in a fractal
repeller representing the trapped trajectories (for the sake of convenience
of interpretation, one can imagine an infinite set of particles all moving
independently among the scatterers) that survive escape through the
boundaries. The repeller is characterized by a natural measure (which
coincides with the dynamical measure for $\beta = 1$), with reference to which
the characteristic features of microscopic dynamical chaos are related as

$$\gamma = \sum_{\lambda_i > 0} \lambda_i - h_{KS}, \tag{9-95}$$

where each of these quantities depends on the geometry of the fixed scatterers
within the region under consideration.


We now look at the same problem in terms of the probability density
$P(\mathbf{r}, t)$ within the region ($0 < x < L$) occupied by the scatterers. In the
limit of the number of scatterers going to infinity, the particle undergoes
a diffusive motion when considered over a large time interval, and the
probability density satisfies a diffusion equation

$$\frac{\partial P}{\partial t} = D \nabla^2 P, \tag{9-96a}$$

where $D$ stands for the diffusion coefficient and where the boundary conditions
at the absorbing planes read

$$P(x = 0, t) = P(x = L, t) = 0. \tag{9-96b}$$


The solution for the probability distribution along the $x$-direction, obtained
by a separation of variables, can be written down in the form of an
eigenmode expansion as [29]

$$P(x, t) = \sum_{n=1}^{\infty} a_n \sin\left(\frac{n \pi x}{L}\right) \exp\left(-\frac{n^2 \pi^2}{L^2} D t\right), \tag{9-96c}$$

where the $a_n$'s are a set of coefficients that depend on the initial probability
distribution. The slowest decaying mode (analogous to the so-called
‘hydrodynamic modes’ of a spatially extended system) corresponds
to $n = 1$, giving the ‘macroscopic’ escape rate ($\gamma'$) as

$$\gamma' = \frac{\pi^2}{L^2} D, \tag{9-97}$$

so that $P(x, t) \sim e^{-\gamma' t}$ for large $t$. Equating the escape rates ($\gamma$, $\gamma'$) obtained
in the two approaches one arrives at

$$D = \lim_{L \to \infty} \frac{L^2}{\pi^2} \left(\sum_{\lambda_i > 0} \lambda_i - h_{KS}\right), \tag{9-98}$$

where the limit $L \to \infty$ is necessary to represent the motion among the
scatterers as a diffusion problem. Thus, for the sake of consistency, the
expression $\left(\sum_{\lambda_i > 0} \lambda_i - h_{KS}\right)$ for a given scattering geometry with a finite
$L$ (in which case the problem appears as one of escape from a fractal
repeller) has to have a $1/L^2$ dependence, which is indeed what is found in
numerical experiments. The above formula has been verified numerically
in the case of an irregular distribution of the scatterers as well as a periodic
distribution (the random and periodic ‘Lorentz model’) in two and
three dimensions.
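
The macroscopic side of this argument, namely that the total probability in a slab with absorbing walls decays at the rate $\pi^2 D/L^2$ at late times, can be checked with a simple finite-difference sketch. This is only an illustration of the 1D diffusion equation with absorbing ends, not of the underlying Lorentz-gas dynamics; the grid size, time window and explicit scheme are assumptions made for the example.

```python
import math

def decay_rate(D=1.0, L=1.0, M=50, r=0.4, t_start=0.3, t_end=0.5):
    """Late-time decay rate of the total probability for dP/dt = D d2P/dx2
    on 0 < x < L with P = 0 at both ends, using an explicit scheme."""
    dx = L / M
    dt = r * dx * dx / D                      # r <= 0.5 keeps the scheme stable
    P = [0.0] + [1.0] * (M - 1) + [0.0]       # uniform initial profile
    n_start = int(round(t_start / dt))
    n_end = int(round(t_end / dt))
    total_start = None
    for step in range(1, n_end + 1):
        # update interior points; the end points stay at zero (absorbing walls)
        P = [0.0] + [P[i] + r * (P[i + 1] - 2.0 * P[i] + P[i - 1])
                     for i in range(1, M)] + [0.0]
        if step == n_start:
            total_start = sum(P)
    total_end = sum(P)
    return math.log(total_start / total_end) / ((n_end - n_start) * dt)

D, L = 1.0, 1.0
gamma_numeric = decay_rate(D, L)
gamma_macro = math.pi ** 2 * D / L ** 2       # slowest eigenmode, n = 1
print(gamma_numeric, gamma_macro)
```

Read in the opposite direction, a measured escape rate multiplied by $L^2/\pi^2$ gives an estimate of the diffusion coefficient, which is the logic behind the escape rate formula.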


This approach for the determination of transport coefficients relating to 
systems of the type considered above (non-interacting particles stream- 
ing through fixed scatterers) can be generalized to obtain other transport 
coefficients as well, such as shear viscosity and heat conductivity, by 
making use of the so-called Helfand moments defining these coefficients 
(refer to [52], chapter 6, for details; see sec. 9.7 below for a brief outline 
of the direct link between diffusive processes and the Liouville dynamics 


in the phase space). 


9.7 Spatially extended systems: spectral theory of hydrodynamic modes


In spatially extended systems, the relaxation to equilibrium is governed in 
the long run by the hydrodynamic modes (refer back to section 8.2.3.1). 
These modes can be explicitly constructed for non-interacting particles 
moving through periodically extended arrays of scatterers as in the es- 


cape rate formalism of sec. 9.6 above. 


A simple instance of this approach is provided by the infinitely extended
periodic Lorentz gas of hard sphere scatterers in which a point particle (or
a stream of independent particles) moves by successive scattering events.
The probability density of finding the particle anywhere within the lattice
can be Fourier-transformed into a set of functions $\rho^{(k)}$, where $k$ labels
a continuum of quasi-periodic density functions, such that $\rho^{(k)}$ for each
specified $k$ evolves by the Frobenius-Perron operator $F_t^{(k)}$ obtained by the
Fourier transformation from the operator $F_t$ in the position space. One
can expand $\rho^{(k)}$ in terms of the eigenfunctions of $F_t^{(k)}$ defined as

$$F_t^{(k)} \psi^{(k)} = e^{-s_k t} \psi^{(k)}, \tag{9-99a}$$

where $s_k$ determines the time evolution of the eigenmode, and where the
eigenfunction satisfies the quasi-periodic boundary condition

$$T_q \psi^{(k)} = e^{i k \cdot q} \psi^{(k)}, \tag{9-99b}$$

$T_q$ being the operator corresponding to the translation by a lattice vector
$q$ in the periodic array of scatterers. For each $k$, one will have a spectrum
of Ruelle-Pollicott resonances, among which the hydrodynamic mode is
identified as the one satisfying the condition $\lim_{k \to 0} s_k = 0$, since the
hydrodynamic mode describes the relaxation to a uniform stationary state.
The dispersion relation satisfied by this mode is of the form (refer back
to (8-24d)),

$$s_k = D k^2 + O(k^4). \tag{9-100}$$


With the hydrodynamic mode $\psi^{(k)}$ thus identified with reference to the
spectra of the Ruelle-Pollicott resonances, one finds that it typically has
a fractal structure in the phase space with a cumulative function (defined
as an appropriately constructed integral over the singular function $\psi^{(k)}$)
that is continuous but non-differentiable. The diffusion coefficient defining
the hydrodynamic mode is then obtained in terms of the Hausdorff
dimension of a certain fractal curve determined by the cumulative function
([52], [29]).


The hydrodynamic modes have been explicitly constructed and shown to
have fractal cumulative functions in the hard disk and Yukawa potential
periodic Lorentz models, as well as in a multibaker model of diffusion [52],
[53].


9.8 Liouville dynamics and non-equilibrium steady states


Imagine a region of space, containing a distribution of scatterers, bounded
by two porous walls at $x = 0$, $x = L$ and unbounded along the y- and z-axes
of a Cartesian system. We assume that particles are injected through the
wall at $x = 0$ at a constant rate, while there occurs an outflux through
the wall at $x = L$, so that a constant concentration gradient is maintained
along the x-axis. This causes a steady state to be set up, with a diffusion
current maintained along the x-axis.


The steady state problem differs from the diffusion problem in an infinitely
extended isolated system with no external flux, in which case an
initial concentration profile decays exponentially to zero (the equilibrium
state), though the same diffusion constant ($D$) governs the macroscopic
dynamics in the two cases, provided that both take place sufficiently
close to the equilibrium configuration of a uniform concentration. As
mentioned in sec. 9.7, the hydrodynamic diffusive mode in the evolution
towards equilibrium is related to a Ruelle-Pollicott resonance of the spectral
function, associated with a singular distribution in the phase space
arising as the leading eigenfunction of the Frobenius-Perron operator.


In the steady state problem, on the other hand, the diffusive eigenmode
remains regular as long as the system is confined within a finite interval
along the x-axis. In the limit $L \to \infty$, however, the eigenmode
satisfying the relevant boundary condition reduces to a singular distribution
with fractal properties.


Results on simple models of diffusive processes in infinitely extended 
systems confirm that non-equilibrium steady states close to the equi- 
librium configuration indeed involve singular distributions in the phase 
space [52] (the cumulative function of the singular measure has fractal 
features). Analogous results are expected to hold for non-equilibrium 
steady states far from equilibrium. What is more, the crucial feature of a 
positive entropy production rate is also seen to emerge as a consequence 


of the singular nature of the relevant eigenmode. 


All this goes to show that non-equilibrium steady states and the asso- 
ciated thermodynamic feature of a positive entropy production can be 
traced back directly to the Liouville dynamics of a system, in which a
non-equilibrium steady state is distinguished from an equilibrium configuration
by the boundary condition at large distances, while the local
dynamics in the phase space remains Liouvillian in both cases.


This approach of linking the features of a non-equilibrium steady state 
with the Liouville dynamics in the phase space at the microscopic level 
is to be compared with the one based on the so-called ‘chaotic hypothe- 
sis’, without direct reference to underlying Liouville dynamics, when the 
SRB distribution is found to emerge, residing on a fractal attractor in the 
phase space (refer to sec. 9.10 below). The latter approach accommodates 
far-from-equilibrium steady states as well as near-equilibrium ones, and 
links the positive entropy production rate with the phase space con- 
traction resulting from the dimension loss of the non-equilibrium prob- 
ability measure owing to the fractal nature of the attractor. It is not 
far-fetched to surmise that multiple chaotic scattering is the basic phenomenon
underlying the singular and fractal features associated with the
non-equilibrium steady states.


As mentioned in sec. 9.5.7, the occurrence of the Ruelle-Pollicott resonances
is presumably associated with chaotic properties of a system (recall that the
resonances can be demonstrated to exist for an Axiom-A system).


9.9 Non-equilibrium steady states in thermostated systems


9.9.1 Thermostats: Introduction 


In sections 9.6, 9.7, we had a look into non-equilibrium configurations 
of systems described in terms of Hamiltonian dynamics, where the vari- 
ous non-equilibrium configurations were realized by means of boundary 


conditions on the systems under consideration. 


In the present section we consider systems under the action of exter- 
nal fields (or gradients introduced by means of boundary flux condi- 
tions), where non-equilibrium steady states are maintained by the action 
of thermostats compensating the effects of the fields tending to disturb 


the steady states. 


However, the thermostats under consideration are not actual physical 
systems, but are simulated ones, whose effects are introduced by mod- 
ifying the microscopic equations describing the system dynamics. This 
is of particular relevance in non-equilibrium molecular dynamics (NEMD) 
studies that have revolutionized the field of investigations into the non- 
equilibrium behavior, over an immensely broad range of situations, of 
systems of diverse descriptions. Typical NEMD studies are computational 
ones with a simulated interaction with an external field. The latter, how- 
ever, pumps energy into the system which thereby gets heated up (as 
indicated by an unbounded increase in the average energy of the particles
making up the system, or of the temperature, appropriately defined
in the non-equilibrium context), ruling out the establishment of a steady
state. In an actual experiment, a steady state is established by bringing 
the system in contact with a thermostat that essentially acts as an in- 
finitely large reservoir quasi-statically drawing away the energy pumped 
into the system by the external field, the result of the joint action of the 
field and the thermostat being a steady flux set up in the system. A 
similar situation is simulated in an NEMD study by modifying the micro- 
scopic equations describing the system dynamics such that the constraint 


of constant energy is realized. 


The effect of constraints acting on a mechanical system results in the 
appearance of forces of constraint in the equations of motion. The latter 
can no longer be derived from a Hamiltonian (though, in numerous sit- 
uations, a Hamiltonian description can be obtained in terms of a set of 
non-canonical variables) and acquire a dissipative character where the 


conservation of phase space volume does not hold. 


In this context, we recall the case of a particle interacting with an oscillator 
bath, outlined in sec. 8.7.2.2. As one works out the reduced equations 
of motion of the Brownian particle by projecting out the bath variables, a 
friction term makes its appearance along with a random force term. In 
the present context, the effect of the thermostat (required for constraining 
the energy to a constant value) appears in the form of a set of terms in 
the equations of motion, as a result of which the equations assume a non- 


Hamiltonian and dissipative character. 


As a simple illustration of the use of thermostated equations of motion
we consider a system of $N$ particles described by the equations

$$m_i \ddot{\mathbf{r}}_i = \mathbf{F}_i \quad (i = 1, 2, \ldots, N), \tag{9-101a}$$

where the forces $\mathbf{F}_i$ include those due to external fields acting on the
system. The effect of a thermostat can be described in terms of a modified
set of equations of motion as

$$m_i \ddot{\mathbf{r}}_i = \mathbf{F}_i - \alpha m_i \dot{\mathbf{r}}_i \quad (i = 1, 2, \ldots, N), \tag{9-101b}$$

where the damping constant $\alpha$ is chosen to satisfy

$$\alpha = \frac{\sum_i \mathbf{F}_i \cdot \dot{\mathbf{r}}_i}{\sum_i m_i \dot{\mathbf{r}}_i^2} \tag{9-101c}$$

(notation: $m_i$, $\mathbf{r}_i$, mass and position vector of the $i$th particle; $\mathbf{F}_i$, force on
the $i$th particle due to interaction with other particles of the system and
due to external fields ($i = 1, 2, \ldots, N$); check that the above choice for $\alpha$
results in a constant value of the total kinetic energy $\frac{1}{2} \sum_i m_i \dot{\mathbf{r}}_i^2$).


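
The check suggested above can also be carried out numerically. The sketch below is a minimal illustration, not an NEMD code: it assumes three non-interacting particles in two dimensions, each subject to the same constant external force (so positions need not be tracked), with the damping coefficient taken as the ratio of the total power input $\sum_i \mathbf{F}_i \cdot \dot{\mathbf{r}}_i$ to twice the kinetic energy $\sum_i m_i \dot{\mathbf{r}}_i^2$. The masses, initial velocities, field strength and the fourth-order Runge-Kutta integrator are all choices made for the example.

```python
def accelerations(vel, masses, field):
    """dv_i/dt = F_i/m_i - alpha v_i, alpha = sum(F_i . v_i) / sum(m_i v_i**2)."""
    power = sum(field[0] * vx + field[1] * vy for vx, vy in vel)
    twice_ke = sum(m * (vx * vx + vy * vy) for m, (vx, vy) in zip(masses, vel))
    alpha = power / twice_ke
    return [(field[0] / m - alpha * vx, field[1] / m - alpha * vy)
            for m, (vx, vy) in zip(masses, vel)]

def rk4_step(vel, masses, field, dt):
    """One fourth-order Runge-Kutta step for the thermostated velocities."""
    def shifted(base, slope, h):
        return [(vx + h * ax, vy + h * ay)
                for (vx, vy), (ax, ay) in zip(base, slope)]
    k1 = accelerations(vel, masses, field)
    k2 = accelerations(shifted(vel, k1, 0.5 * dt), masses, field)
    k3 = accelerations(shifted(vel, k2, 0.5 * dt), masses, field)
    k4 = accelerations(shifted(vel, k3, dt), masses, field)
    return [(vx + dt * (a1[0] + 2 * a2[0] + 2 * a3[0] + a4[0]) / 6.0,
             vy + dt * (a1[1] + 2 * a2[1] + 2 * a3[1] + a4[1]) / 6.0)
            for (vx, vy), a1, a2, a3, a4 in zip(vel, k1, k2, k3, k4)]

def kinetic_energy(vel, masses):
    return 0.5 * sum(m * (vx * vx + vy * vy) for m, (vx, vy) in zip(masses, vel))

masses = [1.0, 2.0, 0.5]                          # hypothetical values
vel = [(1.0, 0.0), (-0.3, 0.8), (0.2, -0.5)]      # hypothetical initial velocities
field = (0.7, 0.0)                                # same constant force on each particle
ke_initial = kinetic_energy(vel, masses)
for _ in range(2000):
    vel = rk4_step(vel, masses, field, 1e-3)
ke_final = kinetic_energy(vel, masses)
print(ke_initial, ke_final)
```

The velocities are steadily turned towards the field direction while the total kinetic energy stays fixed to within the integration error; without the damping term the same field would increase the kinetic energy without bound.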
9.9.2 Digression: Constrained motion 


More generally, a constraint is described by an equation (non-integrable
in general) of the form

$$\sum_i \mathbf{A}_i \cdot d\mathbf{r}_i + \sum_i \mathbf{B}_i \cdot d\mathbf{p}_i + C\,dt = 0, \tag{9-102a}$$

where we have introduced the momenta $\mathbf{p}_i = m_i \dot{\mathbf{r}}_i$ ($i = 1, 2, \ldots, N$) in the
place of velocities, and where $\mathbf{A}_i$, $\mathbf{B}_i$, $C$ are specified functions of the
positions and momenta (and possibly of time). Denoting by $\mathbf{F}_i$ the forces
acting on the particles in the absence of the constraint, the actual forces
will be of the form $\mathbf{F}_i - \mathbf{G}_i$ where the constraint forces ($-\mathbf{G}_i$) are to be
determined. This is conveniently done by making use of Gauss' principle
of least constraint which states that the actual rates of change of the
momenta ($\dot{\mathbf{p}}_i$) will be such as to minimize the ‘curvature’

$$\mathcal{C} = \frac{1}{2} \sum_i \frac{1}{m_i} \left(\dot{\mathbf{p}}_i - \mathbf{F}_i\right)^2, \tag{9-102b}$$

subject to the constraint (9-102a), which can be expressed in the form

$$\sum_i \mathbf{g}_i \cdot \dot{\mathbf{p}}_i = q, \tag{9-102c}$$

$\mathbf{g}_i$, $q$ being specified functions. Thus, introducing the undetermined
multiplier $\alpha$, one obtains the set of equations

$$\frac{\partial}{\partial \dot{\mathbf{p}}_i} \left[\mathcal{C} + \alpha \left(\sum_j \mathbf{g}_j \cdot \dot{\mathbf{p}}_j - q\right)\right] = 0, \tag{9-102d}$$

with $\mathcal{C}$ as in (9-102b). This gives the constraint forces as

$$\mathbf{G}_i = \alpha m_i \mathbf{g}_i, \tag{9-103a}$$

where the multiplier $\alpha$ is given by

$$\alpha = \frac{\sum_i \mathbf{F}_i \cdot \mathbf{g}_i - q}{\sum_i m_i \mathbf{g}_i^2}. \tag{9-103b}$$

Once the $\mathbf{G}_i$'s are determined from (9-103a), (9-103b), the equations of
motion are obtained as

$$\dot{\mathbf{p}}_i = \mathbf{F}_i - \mathbf{G}_i. \tag{9-103c}$$
In the special case of a constraint where the kinetic energy )/, ue is re- 
quired to be constant, one obtains 
q=0, ==, (9-104a) 
Mi 
and thus 
F;-pi 
a= m2 ? 
3 FiPi p, 
Ges — (9-104b) 
FI 
JI my 


in agreement with (9-101b), (9-101c). 
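The recipe of this section can be sketched in a few lines of code. The sketch below assumes a constraint of the form $\sum_i \mathbf{g}_i\cdot\dot{\mathbf{p}}_i = q$, takes the multiplier as $\alpha = (\sum_i \mathbf{F}_i\cdot\mathbf{g}_i - q)/\sum_i m_i g_i^2$ and the constrained rates as $\dot{\mathbf{p}}_i = \mathbf{F}_i - \alpha m_i\mathbf{g}_i$, and then confirms that the constraint is indeed satisfied. The function names and the randomly generated test data are hypothetical and serve only as an illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gauss_multiplier(masses, g, forces, q):
    """Multiplier alpha = (sum_i F_i.g_i - q) / (sum_i m_i g_i^2), as in (9-103b)."""
    numerator = sum(dot(f, gi) for f, gi in zip(forces, g)) - q
    denominator = sum(m * dot(gi, gi) for m, gi in zip(masses, g))
    return numerator / denominator


def constrained_pdot(masses, g, forces, q):
    """Momentum rates pdot_i = F_i - alpha * m_i * g_i, as in (9-103a), (9-103c)."""
    alpha = gauss_multiplier(masses, g, forces, q)
    return [
        [fk - alpha * m * gk for fk, gk in zip(f, gi)]
        for m, gi, f in zip(masses, g, forces)
    ]


# Arbitrary illustrative data (assumptions, not taken from the text).
rng = random.Random(1)
n = 4
masses = [rng.uniform(0.5, 2.0) for _ in range(n)]
forces = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
g = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
q = 0.3

pdot = constrained_pdot(masses, g, forces, q)
residual = sum(dot(gi, pd) for gi, pd in zip(g, pdot)) - q
print(residual)  # constraint residual; should vanish up to rounding
```

Choosing `g[i]` equal to the velocity of the ith particle and `q = 0` reproduces the isokinetic special case discussed at the end of the section.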


9.9.3 Gaussian isokinetic and isoenergetic thermostats 


In setting up the thermostated equations of motion it is useful to separate
the forces $\mathbf{F}_i$ ($i = 1,2,\ldots,N$) into the internal and external ones, where the
internal forces ($\mathbf{F}_i^{(\mathrm{in})}$) are derivable from a scalar pair potential $\phi(r)$ (r =
separation between the particles making up the pair) while the external
forces ($\mathbf{F}_i^{(\mathrm{ex})}$) are commonly of the form $D_i\mathbf{F}^{(\mathrm{ext})}$, $D_i$ being the coupling
constant of the external field to the ith particle.

The Hamiltonian in the absence of the external field is thus given by

$$H = \sum_i \frac{p_i^2}{2m_i} + \sum_{i<j}\phi(|\mathbf{r}_i - \mathbf{r}_j|). \qquad \text{(9-105)}$$

The thermostat model commonly used in typical NEMD experiments
can be of either the Gaussian isokinetic (GIK) or the Gaussian isoenergetic
(GIE) type, corresponding to the constraint conditions

$$\text{(GIK:)}\quad \sum_i \frac{p_i^2}{2m_i} = \text{constant},$$

$$\text{(GIE:)}\quad \sum_i \frac{p_i^2}{2m_i} + \sum_{i<j}\phi(|\mathbf{r}_i - \mathbf{r}_j|) = \text{constant}. \qquad \text{(9-106)}$$

Invoking the principle of least constraint, the equations of motion are
found to be of the form

$$\dot{\mathbf{r}}_i = \frac{\mathbf{p}_i}{m_i}, \qquad \dot{\mathbf{p}}_i = \mathbf{F}_i^{(\mathrm{in})} + \mathbf{F}_i^{(\mathrm{ex})} - \alpha\mathbf{p}_i \quad (i = 1,2,\ldots,N), \qquad \text{(9-107a)}$$

where the multiplier $\alpha$ is given, in the two cases, by

$$\text{(GIK:)}\quad \alpha = \frac{1}{\sum_i \frac{p_i^2}{m_i}}\sum_i \frac{\mathbf{p}_i}{m_i}\cdot\left(\mathbf{F}_i^{(\mathrm{in})} + \mathbf{F}_i^{(\mathrm{ex})}\right), \qquad \text{(9-107b)}$$

$$\text{(GIE:)}\quad \alpha = \frac{1}{\sum_i \frac{p_i^2}{m_i}}\sum_i \frac{\mathbf{p}_i}{m_i}\cdot\mathbf{F}_i^{(\mathrm{ex})}. \qquad \text{(9-107c)}$$

One notes that, in the isoenergetic case, the constraint force vanishes in
the absence of the external field. This is consistent with the observation
that, with $\mathbf{F}^{(\mathrm{ext})} = 0$, one does not need a constraint to ensure that the
energy remains constant.
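The difference between the two multipliers can be illustrated with a small numerical sketch. It assumes the isokinetic multiplier to be $\sum_i (\mathbf{p}_i/m_i)\cdot(\mathbf{F}_i^{(\mathrm{in})} + \mathbf{F}_i^{(\mathrm{ex})})$ divided by $\sum_i p_i^2/m_i$, and the isoenergetic one to be the same expression with the external forces alone in the numerator. The function names and the two-particle data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gik_alpha(masses, momenta, f_in, f_ex):
    """Isokinetic multiplier: internal and external forces both contribute."""
    numerator = sum(
        (dot(p, fi) + dot(p, fe)) / m
        for m, p, fi, fe in zip(masses, momenta, f_in, f_ex)
    )
    denominator = sum(dot(p, p) / m for m, p in zip(masses, momenta))
    return numerator / denominator


def gie_alpha(masses, momenta, f_ex):
    """Isoenergetic multiplier: only the external forces contribute."""
    numerator = sum(dot(p, fe) / m for m, p, fe in zip(masses, momenta, f_ex))
    denominator = sum(dot(p, p) / m for m, p in zip(masses, momenta))
    return numerator / denominator


# Two particles in 2D with equal and opposite internal forces and no
# external field (illustrative numbers, not taken from the text).
masses = [1.0, 2.0]
momenta = [(1.0, 0.5), (-0.5, 1.0)]
f_in = [(0.3, -0.2), (-0.3, 0.2)]
no_field = [(0.0, 0.0), (0.0, 0.0)]

alpha_gik = gik_alpha(masses, momenta, f_in, no_field)
alpha_gie = gie_alpha(masses, momenta, no_field)
print(alpha_gik, alpha_gie)
```

With the field switched off, the isoenergetic multiplier is identically zero while the isokinetic one is not, since the internal forces alone change the kinetic energy.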


While the isokinetic and the isoenergetic constraints are widely invoked in
NEMD simulations, various other constraints (such as the isobaric,
isochoric and isoenthalpic ones) are also invoked, depending on the context.
In particular, the Nosé-Hoover thermostat, briefly outlined in sec. 9.9.5
below, is extensively used in NEMD studies.


As a consequence of the introduction of the thermostat term in the equa- 
tion of motion (eq. (9-107a)), the system becomes non-conservative since 
the phase space volumes are no longer conserved as seen in sec. 9.9.6 
below where we find that, compared to the Hamiltonian time evolution in 
the absence of the external field and the thermostat, the Liouville equa- 
tion gets modified. In consequence, the equilibrium probability distribu- 
tion (i.e., the stationary distribution in the absence of the external field) 
also gets modified. Finally, in the presence of the external field along with
the thermostat, one obtains a stationary state described in terms of an
attractor since, for a large class of systems of interest, the average rate of
change of the phase space volume turns out to be negative.


In the next section (sec. 9.9.4) we consider the thermostated system in
the absence of the external field ($\mathbf{F}^{(\mathrm{ext})} = 0$) and work out the invariant
distribution characterizing the system.

9.9.4 Thermostated systems: Equilibrium phase space distribution

In the case of a GIE thermostat, we have seen that the dissipation factor
$\alpha = 0$ in the absence of the external field. In other words, the system is
characterized by the microcanonical distribution on the energy surface.

In the case of a GIK thermostat, on the other hand, it is the kinetic energy
($K = \sum_i \frac{p_i^2}{2m_i}$) rather than the total energy that is constrained to remain
constant ($K = K_0$, say), and the equilibrium distribution can be factored
into distinct distributions in the momentum space and the configuration
space. The momentum space distribution is just a uniform one on the
surface $K = K_0$, while the configuration space distribution can be seen to
be a canonical one involving the potential energy ($\Phi = \sum_{i<j}\phi(|\mathbf{r}_i - \mathbf{r}_j|)$)
with an effective temperature given by

$$T_{\mathrm{eff}} = \frac{2K_0}{(3N-4)k_B}, \qquad \text{(9-108)}$$

since the number of degrees of freedom is $3N-4$ (the three components of
the total momentum, and the total kinetic energy being conserved
quantities). The equilibrium distribution function (i.e., the invariant
distribution in the absence of external fields) is thereby given by

$$\rho_{\mathrm{eq}} = \frac{e^{-\Phi/(k_B T_{\mathrm{eff}})}\,\delta(K - K_0)}{\int dX\, e^{-\Phi/(k_B T_{\mathrm{eff}})}\,\delta(K - K_0)}, \qquad \text{(9-109)}$$

where the denominator provides for the normalization.

In the presence of the external field and the constraint introduced by
the thermostat, the system seeks out a steady state characterized by
a singular distribution, caused by the phase space contraction resulting
from the dissipative dynamics. This we will turn to in sec. 9.10 below.
As outlined in sec. 9.11, it is the fractal structure of the attractor in the
phase space associated with the singular distribution that provides the
basis for the irreversible entropy production in the steady state.


9.9.5 The Nosé-Hoover thermostat 


The Nosé-Hoover thermostat is widely used in molecular dynamics studies
since the equilibrium distribution it leads to is the canonical one in
the phase space of the system under consideration (recall that the GIK
thermostat produces a canonical distribution only in the configuration
space; refer back to sec. 9.9.4). This is achieved by assigning an additional
degree of freedom (s; not to be confused with the symbol s used
earlier to denote the number of degrees of freedom of a system) to
the thermostat (instead of representing the latter by a phase space
dependent parameter $\alpha$ as in the GIK and the GIE thermostats). Making use
of the additional degree of freedom corresponding to the thermostat, one
starts from a Hamiltonian for the extended system (i.e., the system under
study together with the thermostat) where the momentum variables
for the system and the time ($\mathbf{p}_i, t$) are replaced with simulated variables
$\tilde{\mathbf{p}}_i, \tau$ ($i = 1,2,\ldots,N$) defined in terms of a transformation relating the real
and the simulated variables (see, for instance, [104]). This Hamiltonian
includes three additional parameters $Q, T, g$, where Q stands for the
effective mass for the thermostat variable s, T for the temperature desired
to be realized by the thermostat, and g is set equal to $3N - 1$.

The extended system being closed and Hamiltonian, the equilibrium
distribution it leads to (without an external field) is a microcanonical one. In
an actual calculation, one transforms back from $\tilde{\mathbf{p}}_i, \tau$ to the physical
momentum variables $\mathbf{p}_i$ and the real time t, while s is transformed to a new
dynamical variable $\zeta$ that constitutes a generalization of the parameter $\alpha$
of the GIK and GIE thermostats. The equations of motion obtained from
the resulting Nosé-Hoover thermostat read

$$\dot{\mathbf{r}}_i = \frac{\mathbf{p}_i}{m_i}, \qquad \dot{\mathbf{p}}_i = \mathbf{F}_i - \zeta\mathbf{p}_i, \qquad \dot{\zeta} = \frac{2K_0}{Q}\left(\frac{K(\mathbf{p})}{K_0} - 1\right), \qquad \text{(9-110a)}$$

where

$$K(\mathbf{p}) = \sum_i \frac{p_i^2}{2m_i}, \qquad K_0 = \frac{g+1}{2}\,k_B T. \qquad \text{(9-110b)}$$

These equations represent the isokinetic version of the Nosé-Hoover (N-H)
thermostat, while an isoenergetic version can also be set up.
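The feedback mechanism behind the Nosé-Hoover thermostat can be made concrete with a short sketch. It assumes equations of motion of the form $\dot{\mathbf{r}}_i = \mathbf{p}_i/m_i$, $\dot{\mathbf{p}}_i = \mathbf{F}_i - \zeta\mathbf{p}_i$, $\dot\zeta = (2K_0/Q)(K/K_0 - 1)$, with the target kinetic energy $K_0$ supplied directly as a parameter. For brevity the coordinates are one-dimensional, and the harmonic force law, the function names, and all numbers are hypothetical choices made only for illustration.

```python
def kinetic_energy(momenta, masses):
    return sum(p * p / (2.0 * m) for p, m in zip(momenta, masses))


def nose_hoover_rhs(positions, momenta, zeta, masses, force_fn, Q, K0):
    """Right-hand sides (rdot, pdot, zetadot) of the assumed N-H equations."""
    forces = force_fn(positions)
    rdot = [p / m for p, m in zip(momenta, masses)]
    pdot = [f - zeta * p for f, p in zip(forces, momenta)]
    zetadot = (2.0 * K0 / Q) * (kinetic_energy(momenta, masses) / K0 - 1.0)
    return rdot, pdot, zetadot


def harmonic_force(positions):
    """Hypothetical force law: independent unit-strength harmonic wells."""
    return [-x for x in positions]


masses = [1.0, 1.0]
positions = [0.5, -0.2]
Q, K0 = 1.0, 1.0

# Same configuration, three different kinetic energies (K = 2, 1, 0.25).
hot = nose_hoover_rhs(positions, [2.0, 0.0], 0.0, masses, harmonic_force, Q, K0)
on_target = nose_hoover_rhs(positions, [1.0, 1.0], 0.0, masses, harmonic_force, Q, K0)
cold = nose_hoover_rhs(positions, [0.5, 0.5], 0.0, masses, harmonic_force, Q, K0)
print(hot[2], on_target[2], cold[2])
```

The friction variable grows when the system is hotter than the target and decreases when it is colder, so that $\zeta$ acts as a dynamical counterpart of the multiplier $\alpha$ of the Gaussian thermostats.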


9.9.6 Thermostated systems: Phase space contraction 


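The introduction of the thermostat term in the equations of motion (eq. (9-107a))
causes an alteration in the Liouville equation describing the evolution
of the probability density $\rho(X,t)$ in the phase space, where X stands for
the collection of the phase space co-ordinates $(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N)$.
The equation of continuity of the probability density reads

$$\frac{\partial\rho}{\partial t} + \nabla_X\cdot\mathbf{J}(X) = 0, \qquad \text{(9-111a)}$$

where $\nabla_X\cdot\mathbf{J}(X)$ stands for the divergence (with reference to the phase
space co-ordinates) of the probability current density,

$$\nabla_X\cdot\mathbf{J}(X) = \sum_i\left[\frac{\partial}{\partial\mathbf{r}_i}\cdot(\rho\dot{\mathbf{r}}_i) + \frac{\partial}{\partial\mathbf{p}_i}\cdot(\rho\dot{\mathbf{p}}_i)\right]. \qquad \text{(9-111b)}$$

On using the equations of motion (9-107a), this equation is found to
appear in the form

$$\frac{\partial\rho}{\partial t} + \sum_i\left[\dot{\mathbf{r}}_i\cdot\frac{\partial\rho}{\partial\mathbf{r}_i} + \dot{\mathbf{p}}_i\cdot\frac{\partial\rho}{\partial\mathbf{p}_i}\right] = -\rho\sum_i\left[\frac{\partial}{\partial\mathbf{r}_i}\cdot\dot{\mathbf{r}}_i + \frac{\partial}{\partial\mathbf{p}_i}\cdot\dot{\mathbf{p}}_i\right]. \qquad \text{(9-112a)}$$

In the case of a Hamiltonian system (with Hamiltonian H) the expression
on the right hand side vanishes, while the left hand side reduces to the
total time derivative $\frac{d\rho}{dt}$ ($= \frac{\partial\rho}{\partial t} + \{\rho, H\}$), wherein the above formula reduces
to the Liouville equation. For a thermostated system, on the other hand,
we obtain

$$\frac{d\rho}{dt} = -\rho\sum_i\left[\frac{\partial}{\partial\mathbf{r}_i}\cdot\dot{\mathbf{r}}_i + \frac{\partial}{\partial\mathbf{p}_i}\cdot\dot{\mathbf{p}}_i\right]. \qquad \text{(9-112b)}$$

The summation on the right hand side (taken with the minus sign)
represents the phase space contraction rate $\chi$ (say) where one can now write

$$\frac{d\rho}{dt} = \rho\chi, \qquad \chi = -\sum_i\left[\frac{\partial}{\partial\mathbf{r}_i}\cdot\dot{\mathbf{r}}_i + \frac{\partial}{\partial\mathbf{p}_i}\cdot\dot{\mathbf{p}}_i\right] = \sum_i\frac{\partial}{\partial\mathbf{p}_i}\cdot(\alpha\mathbf{p}_i), \qquad \text{(9-112c)}$$

the last equality being obtained from (9-107a) for a GIK or a GIE
thermostat.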


While the thermostat makes the system under consideration a non-conservative
one, the dynamics remains reversible as in the case of a Hamiltonian
system. Thus the transformation $\mathbf{r}_i \to \mathbf{r}_i$, $\mathbf{p}_i \to -\mathbf{p}_i$, $t \to -t$ ($i = 1,2,\ldots,N$) is
seen to leave the equations (9-107a) invariant. This feature of reversibility
in the non-Hamiltonian equations of motion is in contrast with the
irreversibility of the Langevin equation derived by the elimination of the
bath variables from a Hamiltonian particle-bath coupled system.
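The reversibility of the thermostated equations can be checked directly on the vector field. The sketch below assumes a single particle under a position-independent force $\mathbf{F}$ with the isokinetic multiplier $\alpha = \mathbf{F}\cdot\mathbf{p}/p^2$, so that $\dot{\mathbf{r}} = \mathbf{p}/m$ and $\dot{\mathbf{p}} = \mathbf{F} - \alpha\mathbf{p}$. Invariance under $\mathbf{p} \to -\mathbf{p}$, $t \to -t$ then amounts to $\dot{\mathbf{r}}$ changing sign and $\dot{\mathbf{p}}$ remaining unchanged when the momentum is reversed. The names and numbers are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vector_field(r, p, force, mass):
    """(rdot, pdot) for one isokinetically thermostated particle."""
    rdot = [pk / mass for pk in p]
    alpha = dot(force, p) / dot(p, p)  # odd under p -> -p
    pdot = [fk - alpha * pk for fk, pk in zip(force, p)]
    return rdot, pdot


# Illustrative phase point and force (assumptions, not taken from the text).
r, p, force, mass = [0.1, -0.4], [0.6, 0.8], [1.0, 0.5], 2.0

rdot, pdot = vector_field(r, p, force, mass)
rdot_rev, pdot_rev = vector_field(r, [-pk for pk in p], force, mass)

print(rdot, rdot_rev)  # velocities: equal and opposite
print(pdot, pdot_rev)  # momentum rates: identical
```

The friction term $\alpha\mathbf{p}$ is even under momentum reversal because $\alpha$ itself is odd, which is what distinguishes it from the constant-coefficient friction term of a Langevin equation.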


Thermostated systems have been widely used in NEMD studies for the 
investigation of non-equilibrium steady states (NESS) under far-from- 
equilibrium conditions, and for the determination of transport properties 
under these conditions. For a large class of systems of interest charac- 
terized by dynamical chaos, the phase space contraction rate turns out 
to be positive in virtue of dissipative processes, and the NESS is typically 
described by a singular stationary distribution concentrated on an attractor
characterized by a fractal dimension (more precisely, the attractors
are multifractals, with several distinct fractal dimensions defined
from geometrical and dynamical points of view, such as the box-counting
dimension and the information dimension mentioned in sec. 9.5.9).


The average phase space contraction rate depends on the Lyapunov
exponents since the latter define the expansion and contraction rates around
any given point in the phase space. More precisely, for systems specified
in [123] (these include diffeomorphisms defining Anosov systems, and
also a class of discontinuous mappings characterized by stretching and
folding in the phase space, as in the case of the baker’s map) the phase
space contraction rate is given by

$$\langle\chi\rangle = -\sum_i\langle\lambda_i\rangle, \qquad \text{(9-113a)}$$

where the right hand side stands for the average of the sum of all the
local Lyapunov exponents of the system (taken with a minus sign). In the
case of an ergodic measure describing a non-equilibrium steady state,
the averaging on the right hand side is not necessary since the Lyapunov
exponents are constant almost everywhere on the attractor, and one can
write

$$\langle\chi\rangle = -\sum_i\lambda_i. \qquad \text{(9-113b)}$$


The phase space contraction rate for a thermostated system can be iden- 
tified with the rate of entropy production (see sec. 9.11.2 below) due to the 
dissipative processes taking place in it and is, in turn, related to its
transport properties. The case of electrical conductivity for a non-interacting
Lorentz gas will be briefly outlined in sec. 9.9.7 below.


While we have referred to the phase space contraction rate for a flow,
analogous statements can be made in the case of a mapping $\phi$. Recall
that the map may represent the action of an underlying flow over the
time interval between successive mappings. The phase space contraction
rate is now given by

$$\chi(X) = -\ln\det\big(D\phi(X)\big), \qquad \text{(9-114)}$$

where $D\phi(X)$ stands for the derivative of the map at X, represented with
respect to phase space co-ordinates by the matrix with elements $\frac{\partial\phi_i}{\partial X_j}$,
and ‘det’ for the determinant of this matrix. As established in [123], this
can be expressed in terms of the Lyapunov exponents as in (9-113b).
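A toy example may help fix ideas. The sketch below takes a linear map of the plane with a constant Jacobian (chosen arbitrarily, with real eigenvalues and positive determinant), for which the Lyapunov exponents are simply the logarithms of the absolute values of the eigenvalues, and compares $-\ln\det(D\phi)$ with minus the sum of the exponents. This is only a hypothetical illustration and not one of the systems covered by the results quoted from [123].

```python
import math


def lyapunov_exponents_linear_map(a, b, c, d):
    """ln|eigenvalue| for the constant Jacobian [[a, b], [c, d]].

    Assumes real, distinct eigenvalues (positive discriminant).
    """
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace - 4.0 * det)
    return [
        math.log(abs((trace + root) / 2.0)),
        math.log(abs((trace - root) / 2.0)),
    ]


def contraction_rate(a, b, c, d):
    """chi = -ln det(D phi) for the constant Jacobian [[a, b], [c, d]]."""
    return -math.log(a * d - b * c)


# Stretching by 2 along one direction, contraction by 1/3 along another.
jacobian = (2.0, 1.0, 0.0, 1.0 / 3.0)
chi = contraction_rate(*jacobian)
exponents = lyapunov_exponents_linear_map(*jacobian)
print(chi, exponents)
```

Here one exponent is positive and the other negative, with the contraction dominating, so that the contraction rate comes out positive.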


9.9.7 The thermostated Lorentz gas 


The Lorentz gas, made up of a particle (or a stream of non-interacting 
particles) undergoing elastic reflections from a fixed array of scatterers, 
was introduced as a simple model to describe electrical conduction in a 
periodic solid and other diffusion-like phenomena, and has more recently 
been investigated as a low-dimensional system providing an instance of
non-equilibrium steady states when coupled to a thermostat.


Fig. 9-8 depicts a 2D periodic Lorentz gas made up of a 2D periodic array
of fixed scatterers which we assume to be hard disks. Consider a single
particle (or a stream of non-interacting particles) moving through the
scatterers. During the motion within the array of disks, it suffers elastic
collisions and follows an erratic course, where we assume that the density
of the scatterers is larger than a certain minimum such that trajectories
without collision do not exist. The de-focusing action of the specular
reflections against the disk boundaries results in a diffusive motion of the
particle. If a large number of non-interacting particles is released within
the array of scatterers, the density of particles evolves in accordance with
a diffusion equation, characterized by a diffusion coefficient depending on
the density of the scatterers. While the figure depicts a periodic
arrangement of the disks on a triangular lattice (with a hexagonal Wigner-Seitz
cell), other periodic arrangements (such as the one based on a square
lattice) are also possible. Lorentz gas models with a random disposition
of the scatterers have also been widely studied.




Figure 9-8: Depicting schematically the two-dimensional periodic Lorentz gas 
with circular hard disks placed on a triangular lattice; the hexagonal Wigner- 
Seitz cell is shown in (A), which also shows a part of a trajectory of a particle 
(schematic) undergoing successive reflections from the boundaries of the disks; 
in the close-packed configuration (not shown), where the minimum separation 
between disks is zero, there is no diffusion; (B) the case of infinite horizon,
where the minimum separation w between disks, each of radius a, is larger than
$w_0 = \left(\frac{4}{\sqrt{3}} - 2\right)a$, and the motion of the particle is ballistic, when the diffusion
coefficient is infinite; in between, with 0 < w < $w_0$, the motion is diffusive; in this
intermediate regime, there occurs a phase space contraction on to a fractal
attractor corresponding to a steady state that arises under the action of an applied
field and a compensating thermostat; based on [78], fig. 3.


The 2D or 3D Lorentz gas is known to be a K-system (refer back to 
sec. 9.5.9), and is hence mixing and ergodic, which is why the particle 
suffers a diffusive motion among the scatterers (recall the connection be- 
tween the diffusion coefficient and the escape rate from a finite but large 
array of the scatterers outlined in sec. 9.6). The thermostated Lorentz gas 
is widely studied as a model to investigate non-equilibrium steady states
(NESS) under the influence of external forces and to explore the associated
transport properties where the model may admit of interactions
among the particles moving through the scatterers.

The non-interacting Lorentz gas constitutes a useful model for the study
of electrical conductivity. We assume that each of the particles moving
among the scatterers has a charge q, and is under the action of the
external field $\mathbf{E}$, there being no other force acting on it. In this case the
GIK and GIE thermostats do not differ from one another, and the
thermostated equation of motion reads (refer back to (9-107a), (9-107b))

$$\dot{\mathbf{r}} = \frac{\mathbf{p}}{m}, \qquad \dot{\mathbf{p}} = q\mathbf{E} - \alpha\mathbf{p}, \qquad \text{(9-115a)}$$

where

$$\alpha = \frac{q\,\mathbf{E}\cdot\mathbf{p}}{p^2} \qquad \text{(9-115b)}$$

(check this out).

We now work out the phase space contraction rate for this system. Since
the latter is constituted of just a single particle, one obtains (see
formula (9-112c))

$$\chi = \frac{\partial}{\partial\mathbf{p}}\cdot(\alpha\mathbf{p}) = \frac{\partial}{\partial\mathbf{p}}\cdot\left(\frac{q(\mathbf{E}\cdot\mathbf{p})\,\mathbf{p}}{p^2}\right). \qquad \text{(9-116a)}$$

For the 2D Lorentz gas under consideration, this evaluates to

$$\chi = \alpha = \frac{q\,\mathbf{E}\cdot\mathbf{p}}{p^2} \qquad \text{(9-116b)}$$

(check this out).

For a large class of chaotic systems, the average phase space contraction
rate equals the rate of entropy production (or, simply, the entropy
production, $S_{\mathrm{prod}}$, expressed in units of $k_B$; refer to sec. 9.11.2 below) due to
the irreversible dissipative processes. In the case of the non-interacting
Lorentz gas made up of charged particles under the action of an electric
field, the entropy production per particle in a steady state is related to
the Joule heating as

$$S_{\mathrm{prod}} = \frac{\mathbf{j}_E\cdot\mathbf{E}}{T} = \frac{\sigma E^2}{T}, \qquad \text{(9-117)}$$

where $\mathbf{j}_E$ stands for the steady electric current, $\sigma$ for the electrical
conductivity, and T for the temperature that can be defined for the steady state
in terms of the average kinetic energy. In other words, for the Lorentz gas
under consideration, the electrical conductivity per particle is obtained
from

$$k_B\langle\alpha\rangle = \frac{\sigma E^2}{T}, \qquad \text{i.e.,} \qquad \sigma = \frac{k_B T\langle\alpha\rangle}{E^2}, \qquad \text{(9-118)}$$

where the average $\langle\cdot\rangle$ is taken with respect to the steady state distribution
which, significantly, is a singular one residing on a fractal set in the phase
space. Indeed, the non-zero phase space contraction rate is caused by the
fact that an initial regular distribution evolves towards a singular one in
the steady state.
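The statement that the contraction rate coincides with the multiplier $\alpha$ for the 2D Lorentz gas (the result behind the appearance of $\langle\alpha\rangle$ in (9-118)) can be checked numerically. The sketch below assumes $\alpha = q\,\mathbf{E}\cdot\mathbf{p}/p^2$ and evaluates the momentum-space divergence of $\alpha\mathbf{p}$ by central differences at one phase point. The function names, the step size, and the sample values are illustrative assumptions.

```python
def alpha(p, field, charge):
    """Thermostat multiplier alpha = q E.p / p^2 for a single particle in 2D."""
    return charge * (field[0] * p[0] + field[1] * p[1]) / (p[0] ** 2 + p[1] ** 2)


def contraction_rate(p, field, charge, h=1.0e-6):
    """chi = div_p(alpha * p), estimated with central differences of step h."""
    divergence = 0.0
    for k in range(2):
        p_plus, p_minus = list(p), list(p)
        p_plus[k] += h
        p_minus[k] -= h
        flux_plus = alpha(p_plus, field, charge) * p_plus[k]
        flux_minus = alpha(p_minus, field, charge) * p_minus[k]
        divergence += (flux_plus - flux_minus) / (2.0 * h)
    return divergence


# Illustrative momentum, field and charge (assumptions, not from the text).
p, field, charge = (0.6, 0.8), (1.0, 0.5), 1.0
chi = contraction_rate(p, field, charge)
print(chi, alpha(p, field, charge))
```

The agreement reflects the fact that $\alpha$ is homogeneous of degree $-1$ in the momentum, so that the divergence of $\alpha\mathbf{p}$ in d dimensions is $(d-1)\alpha$, which reduces to $\alpha$ for d = 2.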


The above formula, (9-118), expresses a macroscopically defined transport
coefficient, namely, the electrical conductivity, in terms of $\langle\alpha\rangle$ ($= \langle\chi\rangle$;
refer to the first equality in (9-116b)), the average (taken with respect
to the steady state phase space distribution) of a microscopic feature of
the system dynamics. This is analogous to results in linear response
theory where one relates the macroscopically defined transport coefficients
to averages of microscopically defined quantities, evaluated with respect
to the equilibrium phase space distribution.


In our derivation of the expression of the phase space contraction rate (in
terms of the parameter $\alpha$) and electrical conductivity we have not
explicitly taken into consideration the specular reflections of the particle at the
disk boundaries. The results arrived at are, however, valid in spite of this
apparent omission. The dynamics of the particle involves free motion
between the successive reflections, and the change in the direction of motion
at each reflection, and can be conveniently described as a mapping in a
two-dimensional space made up of the so-called Birkhoff co-ordinates $(\varphi, \sin\theta)$.
In this representation, $\theta$ stands for twice the angle of incidence in a
reflection and $\varphi$ for the azimuthal angle of the point of incidence with reference
to a fixed direction.


The phase space contraction takes place as a result of the local
expansions and contractions in the phase space (refer back to sec. 9.9.6). In
the present instance of a single particle moving through fixed scatterers
there are two non-zero Lyapunov exponents ($\lambda_+ > 0$, $\lambda_- < 0$), and one has

$$\langle\chi\rangle = -\langle\lambda_+ + \lambda_-\rangle, \qquad \sigma = -\frac{k_B T\langle\lambda_+ + \lambda_-\rangle}{E^2}. \qquad \text{(9-119)}$$

As mentioned earlier, the Lyapunov exponents are constants in the case
of an ergodic distribution on the steady state attractor (as in an SRB
distribution; SRB distributions will be introduced in sec. 9.10), and the
averaging on the right hand side is redundant.


For a conservative system, the sum of the local Lyapunov exponents is 
zero everywhere in the phase space. For a non-conservative system with 
many degrees of freedom, such as an interacting Lorentz gas, the non- 
zero Lyapunov exponents can be grouped into pairs where each pair is 
made up of one positive and one negative exponent, the sum of the two 
being the same for all the pairs (the so-called ‘pairing rule’ [104], see 
sec. 9.10.2). 


Referring back to the formulae for the entropy production and electrical
conductivity (equations (9-118), (9-119)), it may be mentioned that, strictly
speaking, the formula for the entropy production in terms of currents
and affinities (the first equality in (9-117)) holds in the near-equilibrium
regime where the principle of local equilibrium, underlying the
formulation of non-equilibrium thermodynamics, can be assumed to be valid
(refer back to sec. 8.2). The transport coefficients too are defined in the
close-to-equilibrium regime where the thermodynamic currents can be
assumed to be linearly related to the respective affinities. In this sense,
the expression for the electrical conductivity (and other transport
coefficients obtained in NEMD studies of thermostated systems) is to be taken
as a definition in the far-from-equilibrium domain.


The transport coefficients defined in the near-equilibrium linear regime
are, by definition, required to be independent of the relevant affinities. In
particular, in the case of electrical conduction, $\sigma$ is to be independent of
$\mathbf{E}$ at any given temperature. In the far-from-equilibrium regime, the
electrical conductivity defined phenomenologically by invoking the entropy
production formula of non-equilibrium thermodynamics is not expected
to be independent of $\mathbf{E}$, and one indeed finds that the conductivity for a
non-interacting Lorentz gas is rather strongly field-dependent. What is
more, the range of field strengths for which $\sigma$ appears to be independent
of $\mathbf{E}$ is found to be extremely narrow near $|\mathbf{E}| = 0$.


This relates to the question of the range of validity of the linear response 
theory, first raised by van Kampen [29]. van Kampen questioned the va- 
lidity of the Green-Kubo formulae on the ground that, in deriving these 
formulae, the ensemble-averaging is performed after the linearization of 
the Liouville dynamics with respect to the applied field (refer back to 
sec. 8.4.4). Strictly speaking, however, the assumption of chaotic dynam- 
ics is inimical to the one of linearity unless one performs an averaging 
over an ergodic ensemble before invoking linearity. It appears, however, 
that the validity of the Green-Kubo formulae can be taken for granted 
over a substantial range of the field strength once the chaotic nature of 
the system dynamics is taken into account in terms of the natural mea- 
sure characterizing it. At the same time, the assumption of a chaotic 
dynamics with a natural measure in the phase space appears to require 
that the system under consideration be made up of a large number of 
interacting particles. Looked at from this viewpoint, the observation of a
rather strong field dependence of the conductivity of the single-particle
Lorentz gas in NEMD studies can be explained as a consequence of the low
dimension of its phase space.


One other feature of the non-interacting Lorentz gas model is that the 
transport properties, as computed by the use of the various different 
thermostats (the GIK, GIE, or the N-H thermostat, for instance) turn out 
to be different, while one expects, on the other hand, that the various 
thermostating mechanisms should be equivalent in respect of the ther- 
modynamic properties of the system under consideration such as the 
transport coefficients. The deviation from equivalence is a consequence 
of the fact that the non-interacting Lorentz gas is a low-dimensional sys- 
tem, where one also finds numerically that the fractal sets describing 
the steady states of the non-interacting Lorentz gas are sensitive to the 
type of thermostat used and also to the physical parameters such as the
strength of the external field.


As mentioned in sec. 9.11 below, the thermodynamic behavior of a
thermostated system, as revealed in the rate of entropy production,
is indeed found to be independent of the type of thermostat used in the
limit of large N, the number of particles making up the system under
consideration [21]. The requirement of large N also follows from physical
considerations since the relaxation time to reach a stationary state from
an arbitrarily specified initial condition (barring exceptional ones) turns
out to be unacceptably large for systems made up of a small number of
particles.

9.10 Non-equilibrium steady states and SRB measures


9.10.1 Non-equilibrium steady states: introduction 


We classify non-equilibrium problems in statistical mechanics in two major
categories: the problem of approach to equilibrium of a closed system
(S), released from an arbitrarily chosen initial state, and the one of a
system (S) through which one or more currents are made to flow by means
of external systems, as a result of which S approaches a non-equilibrium
steady state (NESS). The common feature of the two situations is
entropy production in S, indicative of irreversibility, where the irreversibility
is a feature of the macroscopic description of its time evolution, in terms
of expectation values with respect to the relevant distribution function, in
contrast to the microscopic description, the latter being in terms of the
Hamiltonian equations of motion.


As we saw in chapter 8 (refer to sections 8.5 and 8.8), the two types of 
problems can be accommodated within the fold of a single theoretical 
scheme in the case of a system close to equilibrium, for which one can 
define an entropy and a rate of entropy production (‘entropy production’ in 
brief). This theoretical scheme is based on the second law of
thermodynamics along with a linearity assumption, where the latter specifies the
scope of the theory. It thereby leads to the principles of non-equilibrium
thermodynamics (also referred to as irreversible thermodynamics (IT)).


Recall that non-equilibrium thermodynamics rests on the basic assumption
of local equilibrium which defines the entropy of a non-equilibrium
configuration close to an equilibrium state; within the scope of validity of this
assumption, one assumes a linear relation between the thermodynamic
affinities and currents. The results following from this assumption find an
interpretation in the linear response theory of non-equilibrium statistical
mechanics where the time-dependent probability distribution in the phase
space is approximated by a perturbative expression linear in the external
field (refer to formula (8-124)). It is to be noted that this theoretical scheme
does not provide an independent explanation of irreversibility but, in fact, is
based on the in-built principle of irreversibility in the second law. In other
words, the positive definiteness of the entropy production and of the transport
coefficients is in the nature of an assumption rather than a consequence of
the theory. The positive definiteness is borne out in theoretical
computations and experimental observations, thereby confirming the validity of the
second law.


The question arises as to whether the explanation of macroscopic
phenomena in terms of the microscopic dynamics (the fundamental aim of
statistical mechanics) can be extended in a meaningful way to include
far-from-equilibrium situations as well. In this, there are approaches
where the two types of problems mentioned above (the problem of
approach to equilibrium of a closed system and the one of non-equilibrium
steady state of an open system) are looked at from a unified point of view
(see, for instance, [52]), and a common explanation of the positive value
of the entropy production is sought for (see [55]). However, in the present
section, we consider the NESS problem separately and look for an
explanation for the positive value of the entropy production in the fractal
nature of the NESS (the fractality of non-equilibrium states is of basic
relevance in the approaches adopted in [52] and [55] as well). In this,
we base our approach on an assumption referred to as the chaotic
hypothesis, formulated by Gallavotti and Cohen on the basis of certain
mathematical and physical considerations. Like its illustrious
predecessor (the ergodic hypothesis of Boltzmann) it is thought to be of great
heuristic value, with a considerable potential ability to explain results of
practical relevance in non-equilibrium statistical mechanics, not
necessarily limited to near-equilibrium situations.


Whether and to what extent this leads to an extension of the concepts and 
principles of non-equilibrium thermodynamics to far-from-equilibrium 
situations remains an open question. As of now, only the concept of en- 
tropy production finds a precise interpretation in the theoretical scheme 
to be outlined below, while an extension to situations close to a NESS
(recall that the NESS itself can be far away from an equilibrium
configuration) has been proposed (refer to [125]) where transport coefficients can
be defined in the context of this new regime.


9.10.2 Non-equilibrium steady states in thermostated systems

Dynamical features of thermostated systems are routinely investigated in
NEMD studies covering a wide variety of situations. Of special interest in
the present context are systems with the following features.

(A) The system of interest is made up of a large number of
particles, (generally) interacting among themselves and with an
external field. The energetic effect of the latter is balanced by
a thermostat that constrains the dynamics in some appropriate
manner.

(B) The system is dissipative, involving a phase space
contraction resulting from the thermostated equations of motion
(see below; refer back to sec. 9.9.7). As a result of the phase
space contraction and the constraint imposed by the
thermostat, the system asymptotically approaches a non-equilibrium
steady state (NESS).

(C) The system is strongly chaotic. From a mathematical point
of view, this refers to the feature of hyperbolicity, a class of
hyperbolic systems with a set of nice features being the Anosov
flows and diffeomorphisms (refer to sec. 9.5.5.3). Systems of
physical interest do not come anywhere close to satisfying the
mathematical conditions relating to hyperbolicity. However, the
‘chaotic hypothesis’ to be stated below assumes that the
interactions among the large number of particles constituting the
system effectively lead to hyperbolicity.

(D) Finally, the system under consideration is assumed to be
reversible. Thus, if the dynamics is described by a map $\phi$ (whose
successive iterations correspond to evolution in discrete time)
then there exists a reversing transformation R such that $R^2 = I$,
the identity, and $R\phi R = \phi^{-1}$.


In addition, a system made up of a sufficiently large number of particles is 
expected to be characterized by the feature of smoothness of the Lyapunov 
spectrum [49], which means that, if its Lyapunov exponents are plotted 
in a decreasing (or increasing) order, then the resulting graph resembles 
a smooth function in the thermodynamic limit ($N \to \infty$). The spectrum
of Lyapunov exponents is of fundamental relevance in characterizing the
microscopic dynamics of the system, and all its statistical features are, 
in the ultimate analysis, related to the Lyapunov spectrum including, in 
particular, the rates of phase space contraction and entropy production 
(referred to in sec. 9.9.7 above, and to be further outlined in sec. 9.11). 
Analogous to the case of a system close to an equilibrium configuration, 
one can define appropriate transport coefficients with reference to steady 
states far from equilibrium, as in the case of the electrical conductivity 
of a non-interacting Lorentz gas, and these too are determined by the 


Lyapunov spectrum. 


What is more, a large class of thermostated systems conform to the so- 
called ‘pairing rule’, alluded to in sec. 9.9.7. Briefly, the Lyapunov ex- 
ponents can be grouped in mutually exclusive pairs, where each pair 
contains one positive and one negative exponent, the sum of the two be- 
ing the same for all the pairs; see [104] for a proof pertaining to a system 
with a GIK thermostat, where the internal and external forces are derivable
from a potential. There are $6N - 2$ Lyapunov exponents
grouped in pairs, where $N$ stands for the number of particles in the system
(the proof does not require $N$ to be large).


In the case of a flow, two particular Lyapunov exponents are zero, one cor- 
responding to the constraint introduced by the thermostat, and the other to 
invariance under time translation. These two exponents are excluded from 


the pairing under consideration. 
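
As a compact restatement of the pairing rule, the following sketch may help. The labels $\lambda_i^{\pm}$ for the two members of the $i$-th pair and the symbol $\alpha$ for the common pair sum are notation introduced here for illustration only; they are not taken from [104].

```latex
\lambda_i^{+} + \lambda_i^{-} = -\alpha \quad (i = 1, \ldots, 3N-1),
\qquad
\sum_{i=1}^{3N-1} \left( \lambda_i^{+} + \lambda_i^{-} \right) = -(3N-1)\,\alpha .
```

Under this rule the sum of all the $6N - 2$ paired exponents, and hence the mean phase space contraction rate, is fixed once the sum for any single pair is known; for a dissipative system one expects $\alpha \geq 0$.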


9.10.3 The ‘chaotic hypothesis’ and the SRB distribution


Boltzmann introduced his ergodic hypothesis knowing full well that few,
if any, of the real-life systems of interest in statistical mechanics could be
demonstrated to conform to it. Regardless, the ergodic hypothesis (subsequently
developed into the ergodic theorem of Birkhoff and von Neumann) has
been productive of a vast number of consequences of practical relevance 
that constitute an effective interpretation of all the principal results of equi- 
librium thermodynamics and also of those of non-equilibrium thermody- 


namics in situations where the principle of local equilibrium holds. 


In a similar vein, Gallavotti and Cohen introduced the chaotic hypothesis 
with a view to explaining the features of non-equilibrium steady states of 
systems in the far-from-equilibrium regime that would include, as special 
cases, the thermodynamic behavior in equilibrium and near-equilibrium 


situations. 


Briefly, Gallavotti and Cohen proposed that the Sinai-Ruelle-Bowen dis- 
tribution (commonly referred to as the SRB distribution) can be assumed 


to describe the non-equilibrium steady states of a system. The existence
and properties of the SRB distribution were rigorously established for Anosov
systems (refer back to sec. 9.5.5.3), with an extension to Axiom-A
attractors, and such distributions were explicitly constructed for
low-dimensional Anosov systems. Though it is out of the question to try to
prove the existence of the SRB distribution for interacting many-particle 
systems (to say nothing of the actual construction of such a distribution), 
Gallavotti and Cohen proved a fluctuation theorem on the basis of an 
assumed SRB distribution that could be confirmed in numerical experi- 


ments involving interacting many-particle systems. 


The number of particles in a numerical experiment is typically incomparably 
small when assessed against the order of magnitude of the number in a typ- 
ical thermodynamic system. Even so, there is no question of demonstrating 


the existence of the SRB distribution even for such a NEMD system. 


In the following we consider a system defined by a mapping rather than by 
a flow. Considering a flow in the phase space (M) of a dynamical system, 
we first restrict the motion to a compact hypersurface defined by the
constant energy (or the constant kinetic energy in the case of an isokinetic
constraint) and then define the mapping by means of a further condition 
corresponding to a recurrent event during the motion of the system. The 
latter may correspond to a crossing (along a specified direction) through 
a fixed surface of section in the constrained phase space, or the location 
of the representative point at fixed intervals of time or, say, successive 
collision events (defined appropriately) between particles. We assume that 


this defines a mapping in a reduced phase space P in which the co-ordinates
of a typical point are denoted collectively by X, and that there
is defined a measure dX induced by the Lebesgue measure in the full 
phase space M. Of greater relevance, however, is the dynamical natural 
measure (assuming that it exists) in P that assigns a weight to every 
volume element dX in proportion to the relative frequency of visit of the 
representative point to this volume element in the course of its motion 
taken over a long stretch of time. The motion would then be ergodic with 
reference to this dynamical measure $\mu(dX)$ which, precisely, defines the


SRB measure in P. 


The SRB measure resides on an attractor A in the phase space P. In other 
words, for a volume element dX not in A, one has $\mu(dX) = 0$. Such an
attractor exists for a dissipative Anosov map defined in P. More generally,
for an Axiom-A system there may be more than one such attractor, in
which case we refer to the restriction of the dynamics to the basin of 


attraction of one of these. 


The chaotic hypothesis [49]. 


A reversible dissipative macroscopic system in a non-equilibrium stationary 
state can be regarded as an Anosov system for the purpose of calculating 


its macroscopic properties. 


The assumption of reversibility was included by Gallavotti and Cohen in 
analogy with a conservative Hamiltonian system as an essential requirement
for deriving the fluctuation relation (sec. 9.12). The chaotic hypothesis
draws its inspiration from the SRB theorem that has a number of
variants. Here is a simplified statement (i.e., one lacking in mathematical
precision) of the theorem in the case of an Anosov diffeomorphism $\phi$ with
an attractor A (refer to [147] for details):


there exists a unique measure $\mu$ (the SRB measure) concentrated on A
such that, for a well-behaved function $F$ defined on the phase space, the
following equality holds

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} F(\phi^{k}(X)) = \int_A F\,d\mu, \tag{9-120}$$

for almost all X belonging to the basin of attraction of A; in other words,
the SRB measure is ergodic.


An Anosov system or an Axiom-A attractor can be described as being 


‘strongly chaotic’ in virtue of the property of hyperbolicity. 
The SRB distribution [47]. 


In other words, looking upon the successive iterations of the map as 
describing the evolution of a phase point in discrete time, the time average 
of any reasonable phase space function equals the phase space average 
with respect to the SRB distribution $\mu$. Another characterization of the
SRB measure is that it is the asymptotic limit of an initial non-singular
probability distribution ($\rho(X)$) under $\phi^{k}$ for $k \to \infty$, i.e., as the initial distribution
is propagated in the forward direction through an infinitely long stretch 
of time. The approach to the SRB distribution takes place in the weak 


sense, which gives the alternative definition (9-120). 
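
Spelled out, the weak convergence referred to here may be written as follows. This is a sketch in which $\rho$ denotes the initial non-singular density on the reduced phase space $P$, $\mu$ the SRB measure on the attractor $A$, and $F$ any well-behaved test function; the formula is a standard way of expressing weak convergence and is not quoted from the references.

```latex
\lim_{k\to\infty} \int_{P} F(\phi^{k}X)\,\rho(X)\,dX \;=\; \int_{A} F\,d\mu .
```

It is thus the averages of test functions that converge, not the propagated densities themselves; in the dissipative case the latter have no smooth pointwise limit, since the limiting measure is singular.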
In the case of a dissipative system, the attractor A is, generally speaking,
a fractal set with a dimension less than the dimension of the phase space
P and, as mentioned above, the measure $\mu$ assigns a weight to every volume
element of A as determined by the dynamics. In consequence, considering
any point X in A and referring to its stable and unstable manifolds
that run through the attractor densely, the SRB measure is distributed
continuously along the unstable manifold and is singular along
the stable manifold since any initial distribution gets stretched along the 
former and compressed along the latter. This implies the following char- 
acterization [50] of the SRB distribution in terms of the expansion rates 
along the unstable manifolds at points within the elementary cells of a 
Markov partition, it being known that an Anosov system admits of such a 
partition (refer to sections 9.5.5.4 and 9.5.8; the dynamics of an Anosov 
system can be mapped to a Bernoulli shift with reference to a generating 


Markov partition). 


A Markov partition has the property that applying the map $\phi$ on each
cell of the partition generates the cells of another Markov partition and,
taking all possible intersections of the cells of two Markov partitions, one
obtains a finer Markov partition. Iterating this process of applying powers
($\phi^{k}$ ($k = \pm 1, \pm 2, \cdots$)) of the map $\phi$ and taking all possible intersections, we
obtain Markov partitions with cells as small as we like. Thus, starting
from a partition $\mathcal{E}$, we apply $\phi^{k}$ on the cells of this partition for all $k$ in
the range $-J < k < J$ for some positive integer $J$ sufficiently large and
look at the intersections of all the cells of all the partitions so obtained.
We thereby obtain a much finer partition $\mathcal{E}_J$. We now label the cells in $\mathcal{E}_J$
with the integer index $j$. Taking an arbitrarily chosen point $X_j$ in the cell
$E_j$ belonging to $\mathcal{E}_J$ and choosing a positive (even) integer $\nu (< J)$, we assign
a relative weight $w_{\nu,j}$ to each $E_j$ as

$$w_{\nu,j} = \prod_{k=-\nu/2}^{\nu/2-1} \left( \Lambda(\phi^{k}X_j) \right)^{-1}. \tag{9-121}$$


In this expression, $\Lambda(X)$ for any given point $X$ on the attractor denotes
the determinant of the Jacobian matrix representing the map $\phi$ looked
at as a transformation from the unstable manifold of $X$ to the unstable
manifold of $\phi X$. In other words, $\Lambda(X)$ is the multiplier representing the
expansion of a volume element on the unstable manifold of $X$ under the
action of the map $\phi$. The larger this expansion coefficient, the smaller
its inverse ($(\Lambda(X))^{-1}$), indicating that, in the course of the motion of a
representative point in the attractor under repeated applications of the
map, a smaller fraction of time is spent in the vicinity of $X$. It may be
noted that, for a sufficiently fine Markov partition $\mathcal{E}_J$, the weight $w_{\nu,j}$ is, to
a good degree of approximation, independent of the location of $X_j$ within
$E_j$. By the chain rule of multiplication of the Jacobian determinants,
the right hand side of (9-121) is seen to be the inverse of the Jacobian
determinant of $\phi^{\nu}$ evaluated at $\phi^{-\nu/2}X_j$.
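
The chain-rule step can be written out explicitly. In the sketch below, $\Lambda(X)$ is the expansion coefficient of the map $\phi$ along the unstable manifold at $X$, $X_j$ is the reference point in the $j$-th cell, $\nu$ is the (even) length of the trajectory stretch, and $D_u\phi^{\nu}$ is a shorthand introduced here (not the author's notation) for the Jacobian determinant of $\phi^{\nu}$ restricted to the unstable manifold.

```latex
\prod_{k=-\nu/2}^{\nu/2-1} \Lambda(\phi^{k}X_j)
  = D_u\phi^{\nu}\!\left( \phi^{-\nu/2}X_j \right),
\qquad
w_{\nu,j} = \left[ D_u\phi^{\nu}\!\left( \phi^{-\nu/2}X_j \right) \right]^{-1} .
```

The product runs over the $\nu$ successive points of the stretch starting at $\phi^{-\nu/2}X_j$, so the weight of a cell is small when trajectories passing through it are strongly stretched.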


On normalizing the weights $w_{\nu,j}$, we get a probability distribution $\mu_{\nu,J}$
over the cells of $\mathcal{E}_J$, with respect to which the average of any well-behaved
function $F(X)$ is defined to be

$$\int \mu_{\nu,J}(dX)\,F(X) = \frac{\sum_j w_{\nu,j}\,F(X_j)}{\sum_j w_{\nu,j}}, \tag{9-122}$$

where $X_j$ is chosen arbitrarily in the cell $E_j$ belonging to the Markov partition
$\mathcal{E}_J$ (recall the definition of the weights $w_{\nu,j}$ from above).


We now consider the limit $J \to \infty$, $\nu \to \infty$, maintaining $\nu < J$ during the
limiting process. It turns out that the limit exists, defining the SRB distribution
$\mu$ we are looking for. This definition of the SRB distribution $\mu$,
which follows from its earlier definition as the weak asymptotic limit of
any reasonable initial probability distribution $\rho$ under repeated forward
iterations of the map $\phi$, can be explained pictorially in terms of fig. 9-9. In
this figure, the small squares represent the cells $E_j$ of the Markov partition
$\mathcal{E}_J$, among which we choose any cell (corresponding to some index $j$)
and a point $X$ in this cell. We now consider a stretch of trajectory (dotted
line) of length $\nu$ corresponding to the sequence $\{\phi^{k}X\}$ ($k = -\frac{\nu}{2}$ to $k = \frac{\nu}{2} - 1$).
As the successive points in the trajectory pass through the various cells
in the partition, there corresponds a sequence of expansion coefficients
characteristic of each cell, and the product of all these represents the
resultant expansion coefficient of a small volume element corresponding
to this stretch of trajectory centered at $X$. The normalized weight of the
cell around $X$ is obtained from this expansion coefficient as
$\frac{w_{\nu,j}}{\sum_{j'} w_{\nu,j'}}$, from
which the measure is obtained in the limit $\nu \to \infty$ for an infinitely fine
partition $\mathcal{E}_J$ ($J \to \infty$), in which case the cell around $X$ shrinks to $X$ itself.


The chaotic hypothesis of Gallavotti and Cohen makes the conjecture
that the macroscopic features describing a non-equilibrium steady
state (NESS) of a many-particle Hamiltonian system subject to an external
field and constrained by an appropriate thermostat (such as the GIK,
GIE, or N-H thermostat) can be calculated by assuming the validity of the
SRB distribution, i.e., the average of any well-behaved phase-space function
$F(X)$ can be evaluated as the limit of the right hand side of (9-122)
as explained above.

Figure 9-9: Depicting schematically the process of arriving at the SRB distribution
from a partition $\mathcal{E}_J$ (see text) by a limiting procedure; $X$ is a point chosen
within any specified cell, from which a sequence of iterates $\phi^{k}X$ is obtained for
$k$ ranging from $-\frac{\nu}{2}$ to $\frac{\nu}{2} - 1$; for each iterate one obtains an expansion coefficient
characteristic of the cell in which the representative point lies; the product of
all these expansion coefficients gives the resultant expansion from the initial
point to the final point centered around $X$; normalizing this expansion coefficient
evaluated for all the cells in $\mathcal{E}_J$, one obtains a probability measure for the
cells, from which the SRB measure is obtained in the limit of an infinitely fine
Markov partition ($J \to \infty$), by taking the limit $\nu \to \infty$ as well; the last named limit
is necessary since, for finite $\nu$, the stretch of trajectory shrinks to a point in the
limit $J \to \infty$.


In the case of a closed Hamiltonian system, the SRB measure reduces to 
the microcanonical distribution over the entire phase space since the ex- 
pansion coefficient represented by the Jacobian determinant of the map- 
ping (derived from the Hamiltonian equations describing the system) has 


the trivial value of unity at all points in the phase space. 


The SRB measure has been constructed explicitly for certain baker’s-like 


maps with and without escape [134]. It is found, for instance, that, in the
case of a dissipative baker’s-like map without escape, the SRB measure
is concentrated on a fractal set (more precisely, a multifractal one) with
a nowhere differentiable but continuous cumulative function, the latter
being defined in terms of an integral over the measure. While, strictly
speaking, such a map is not an Anosov system because of a lack of
continuity, one obtains a singular measure that shares with the SRB 
distribution the property of being a unique invariant measure resulting 
as the weak asymptotic limit of an initial probability distribution under 


repeated applications of the map. 
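
To make the idea concrete, here is a minimal sketch of a dissipative baker-like map on the unit square. The specific piecewise-linear form, the parameter names `p`, `l1`, `l2`, and the function names are assumptions chosen for illustration; they are not the maps studied in [134]. For this particular form the natural measure is uniform along the expanding direction `x`, so the phase space average of the local contraction rate can be written in closed form and compared with a time average along a single trajectory.

```python
import math


def baker_step(x, y, p, l1, l2):
    """One iteration of an illustrative baker-like map on the unit square.

    x is the expanding (unstable) direction, y the contracting (stable) one.
    Requires 0 < p < 1 and l1 + l2 <= 1 so that the two image strips do not overlap.
    Returns the new point together with the local Jacobian determinant.
    """
    if x < p:
        return x / p, l1 * y, l1 / p
    return (x - p) / (1.0 - p), (1.0 - l2) + l2 * y, l2 / (1.0 - p)


def mean_contraction_analytic(p, l1, l2):
    """Average of -ln(Jacobian) with x distributed uniformly on [0, 1)."""
    return -(p * math.log(l1 / p) + (1.0 - p) * math.log(l2 / (1.0 - p)))


def mean_contraction_time_average(p, l1, l2, n=200_000, x=0.1234, y=0.5678):
    """Time average of -ln(Jacobian) along one trajectory of length n."""
    total = 0.0
    for _ in range(n):
        x, y, jac = baker_step(x, y, p, l1, l2)
        total -= math.log(jac)
    return total / n


if __name__ == "__main__":
    p, l1, l2 = 0.4, 0.3, 0.5  # illustrative values with l1 + l2 < 1 (dissipative)
    print("phase space average:", mean_contraction_analytic(p, l1, l2))
    print("time average       :", mean_contraction_time_average(p, l1, l2))
    # Area-preserving choice l1 = p, l2 = 1 - p gives zero mean contraction.
    print("conservative case  :", mean_contraction_analytic(p, p, 1.0 - p))
```

The agreement of the two averages illustrates ergodicity with respect to the natural measure, and the mean contraction rate is positive in the dissipative case and vanishes in the area-preserving one. Finite-precision arithmetic makes the time average only an approximation.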


9.11 Non-equilibrium states and entropy production


Two central questions in the large-scale statistical description of a macro- 
scopic system away from equilibrium concern the definition of entropy 
and the specification of an indicator for the irreversible time evolution of 
the system when looked at from a macroscopic point of view. Here the 
term ‘macroscopic’ refers to a description in terms of long term time av- 
erages or phase space averages (with respect to some relevant measure) 
of appropriately chosen phase space functions, the two types of average 
being identical in the case of an ergodic measure such as the SRB dis- 
tribution. As we have seen, the SRB distribution for a non-equilibrium 


steady state is singular in nature.

9.11.1 Hamiltonian time evolution: the paradox of entropy production


For the sake of convenience, we start by recalling the Boltzmann-Gibbs 
definition of entropy for the case of a Hamiltonian flow in the phase space 
M described in terms of the Hamiltonian H(X) (recall that X is meant to 
stand for the collection of position and momentum variables of all the 
particles in the system). We denote the evolution operator in the phase
space by $U_t$, i.e., in time $t$, $X = X_0$ goes to $X_t = U_t X_0$. Since the energy
is conserved under the evolution, it suffices to restrict our attention to
the energy surface $M_E$ corresponding to some fixed energy $E$. The microcanonical
ensemble, concentrated on the energy surface, is given by the
probability measure

$$\mu_E(dX) = \frac{1}{\Omega_E}\,\delta(H(X) - E)\,dX, \tag{9-123a}$$

where the normalization constant $\Omega_E$ (the so-called ‘structure factor’) is
given by

$$\Omega_E = \int dX\,\delta(H(X) - E). \tag{9-123b}$$

For a measure $\mu(dX)$ (defined on $M$) possessing a density $\rho(X)$ (i.e., one
for which $\mu(dX) = \rho(X)dX$), the Gibbs entropy is defined (in units of $k_B$) as

$$S[\rho] = -\int_M \rho(X)\ln\rho(X)\,dX. \tag{9-124}$$


In this context, we recall the following from earlier chapters.

The microcanonical measure $\mu_E(dX)$ of (9-123a) does not possess a regular
density because of the delta-function singularity. In this case, one can
define a measure induced on the energy surface $M_E$ by the microcanonical
measure $\mu_E(dX)$, given by

$$\mu_E(d\omega) = \frac{1}{\Omega_E}\,\frac{d\omega}{|\nabla H|}, \tag{9-125a}$$

where $d\omega$ stands for the surface element on $M_E$:

$$\int_{M_E} \frac{d\omega}{|\nabla H|} = \Omega_E. \tag{9-125b}$$

In other words, $\mu_E(d\omega)$ possesses the uniform probability density $\rho_E = \frac{1}{\Omega_E}$
on the energy surface $M_E$ with respect to the induced measure $\frac{d\omega}{|\nabla H|}$, and
the entropy of the microcanonical distribution is then given (again, in units
of $k_B$) by

$$S = -\int_{M_E} \rho_E \ln\rho_E\,\frac{d\omega}{|\nabla H|} = \ln\Omega_E, \tag{9-125c}$$

the Boltzmann entropy of the microcanonical ensemble. This result can be
obtained in a limiting procedure by referring to the energy shell of a small
‘thickness’ $\delta E$ such that $d\omega\,dy(X) = dX$, where $y(X)$ is a locally defined phase
space co-ordinate perpendicular to the energy surface at $X$, and $dy(X) = \frac{dE}{|\nabla H|}$.
On considering a uniform probability density in the energy shell and then
applying (9-124), one obtains

$$S = \ln\Omega_E + \ln\delta E. \tag{9-126}$$

In this equation, one has to go to the limit $\delta E \to 0$. However, on physical
grounds, the energy surface is an abstraction, since there is always an uncertainty
in the energy of the system. In the case of a system made up of an
enormously large number of particles ($N \to \infty$ in the thermodynamic limit),
$\ln\Omega_E$ is $O(N)$ (since the entropy is an extensive property), which is very
large compared to $\ln\delta E$, on neglecting which we obtain (9-125c).


Starting with the measure $\mu$ at time $t = 0$ (with density $\rho(X)$), the measure
after a time $t$ will be, say, $\mu_t = U_t\mu$ and its density will be given by

$$\rho_t(X) = \frac{\rho(U_{-t}X)}{J_t(U_{-t}X)}, \tag{9-127}$$

where $J_t(X)$ stands for the Jacobian determinant of the transformation
$X \to U_t(X)$ (check this out; if a volume element $dV$ of the phase space at
$t = 0$ is transformed into the element $dV(t)$ in time $t$, then $\rho(U_{-t}X)\,dV =
\rho_t(X)\,dV(t)$, since the time evolution in time $t$ results in the transformation
$U_{-t}X \to X$; since $\frac{dV(t)}{dV} = J_t(U_{-t}X)$, eq. (9-127) follows).


The Gibbs entropy of $\rho_t(X)$ is given by (we continue to measure entropy
in units of $k_B$)

$$S[\rho_t] = -\int \rho_t(X)\ln\rho_t(X)\,dX, \tag{9-128a}$$

from which, using (9-124), (9-127), one obtains

$$S[\rho_t] = S[\rho] + \int \rho(X)\ln J_t(X)\,dX, \tag{9-128b}$$

(check this out, making a transformation of variable $U_{-t}X \to X$ within the
integral sign). In the case of a Hamiltonian time evolution, which conserves
the phase space volume, one has $J_t(X) = 1$ and so $S[\rho_t] = S[\rho]$.
In other words, the Gibbs entropy (which reduces to the Boltzmann entropy
for an equilibrium state) cannot increase during a Hamiltonian time
evolution for which, therefore, no indicator of irreversible time evolution
based on the increase of entropy seems to exist.
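
The change of variable behind (9-128b) can be spelled out as follows. This is a worked sketch using only (9-124) and (9-127); the integration variable $Y$, related to $X$ by $X = U_t Y$ so that $dX = J_t(Y)\,dY$, is introduced here for convenience.

```latex
\begin{aligned}
S[\rho_t] &= -\int \rho_t(X)\,\ln\rho_t(X)\,dX
           = -\int \frac{\rho(Y)}{J_t(Y)}\,
                   \ln\!\left(\frac{\rho(Y)}{J_t(Y)}\right) J_t(Y)\,dY \\
          &= -\int \rho(Y)\ln\rho(Y)\,dY + \int \rho(Y)\ln J_t(Y)\,dY
           = S[\rho] + \int \rho(Y)\ln J_t(Y)\,dY .
\end{aligned}
```

With $J_t = 1$ the second term vanishes identically, which is the statement that the Gibbs entropy is constant under a volume-preserving flow.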


Boltzmann’s way out of this paradox was to introduce the idea of coarse- 
graining, where the phase space is assumed to be made up of small cells 
of finite size, as indicated in sec. 9.2.2. The microscopic state of the sys- 
tem is assumed to be determined by the cell in which the representative 
point resides at any given instant, the probability density within the cell 


being taken to be a constant. 


In this coarse-grained description (assumed to apply to both equilibrium 
and non-equilibrium situations), while the microscopic state at any given 
instant is represented by a cell within which the phase point (now of 
notional significance only) lies, the macroscopic state is represented by 
the collection of cells over which the instantaneous distribution function 
$\rho(X)$ (once again, of notional significance only) has a non-zero value, and
the entropy (a macroscopically defined quantity) is to be worked out on
the assumption that all these cells are equiprobable. In other words, if
$\rho(X)$ is spread over $W$ cells, and if each cell has a volume $\epsilon$
in the phase space, then the macroscopic state corresponds to $\rho = \frac{1}{W\epsilon}$
in each of these cells, while having the value zero for all the other cells.
Then, invoking (9-124) one obtains the entropy (in units of $k_B$)

$$S = \ln W + \ln\epsilon, \tag{9-129a}$$

(check this out). After a time $t$, one expects on physical grounds that the
probability distribution will be spread over a larger number (say, $W_t$) of
cells, this being an assumption consistent with the ergodic hypothesis.
The entropy will then be

$$S_t = \ln W_t + \ln\epsilon, \tag{9-129b}$$

i.e., $S_t > S$. In other words, following the coarse-graining approach of
Boltzmann, one can possibly find a way out of the paradox relating to 
the increase of entropy in non-equilibrium situations, though the above 
considerations lack mathematical clarity and precision. While a version 
of this approach does work nicely in the case of non-equilibrium configu- 
rations close to an equilibrium one, where the assumption of local equi- 
librium holds [60], its applicability to far-from-equilibrium situations is 


less clear. 
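
The coarse-graining argument can be illustrated numerically. The sketch below is only a toy illustration under stated assumptions: it uses the Arnold cat map on the unit torus as a stand-in for an area-preserving chaotic evolution, represents the distribution by a finite cloud of sample points, and evaluates the entropy as $\ln W + \ln\epsilon$ with $W$ the number of occupied cells of volume $\epsilon$. The function names, grid size, and number of points are arbitrary choices made here.

```python
import math
import random


def cat_map(x, y):
    """Arnold cat map on the unit torus (area preserving, Jacobian = 1)."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0


def coarse_grained_entropy(points, cells_per_side):
    """Return ln W + ln eps, where W counts the occupied cells of volume eps."""
    eps = 1.0 / cells_per_side ** 2
    top = cells_per_side - 1
    occupied = {
        (min(int(x * cells_per_side), top), min(int(y * cells_per_side), top))
        for x, y in points
    }
    return math.log(len(occupied)) + math.log(eps)


def entropy_history(n_points=5000, cells_per_side=10, steps=8, seed=0):
    """Coarse-grained entropy of a point cloud that starts inside a single cell."""
    rng = random.Random(seed)
    side = 1.0 / cells_per_side
    points = [(rng.random() * side, rng.random() * side) for _ in range(n_points)]
    history = [coarse_grained_entropy(points, cells_per_side)]
    for _ in range(steps):
        points = [cat_map(x, y) for x, y in points]
        history.append(coarse_grained_entropy(points, cells_per_side))
    return history


if __name__ == "__main__":
    for step, s in enumerate(entropy_history()):
        print(step, round(s, 3))
```

Although the map conserves phase space volume, so that the fine-grained Gibbs entropy stays constant, the number of occupied cells grows as the cloud is stretched and folded, and the coarse-grained entropy rises towards its maximum value of zero for the unit torus.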


9.11.2 Entropy production in non-equilibrium steady states 


We will now have a look at the issue of entropy production in non-equilibrium 
steady states of dissipative systems. In this, we consider a Hamiltonian 
system under the action of (a) external forces that perform work on the 
system, and also of (b) a thermostat whose action is simulated by non- 
Hamiltonian terms in the equations of motion (refer back to sec. 9.9). The 
system attains a steady state in virtue of the constraint imposed by the 
thermostat — a NESS that can be either close to or far from an equilibrium 
state. One can give a precise interpretation to the rate of entropy produc- 
tion in the steady state (see below), thereby arriving at an indicator of 


irreversible processes in the system. One can extend the considerations
to non-equilibrium (and non-stationary) states close to a NESS [125] (see
sec. 9.13 below), where the latter may or may not be close to an equilibrium
state.


However, while this formulation of entropy production in a non-equilibrium 
steady state (NESS) is quite precise and is seemingly productive of con- 
crete results of general validity (see sec. 9.12 below), it is limited in scope 
on two major counts: (i) as mentioned above, it is applicable only to non- 
equilibrium stationary states and to states close to these; thus, there is 
still a wide gap in the specification of a general indicator of irreversibility, 
for instance, one pertaining to a closed system far from equilibrium; and 
(ii) to make things worse, there appears to be no general definition of en- 
tropy itself for non-equilibrium states in general (except for ones close to 
an equilibrium configuration, where the assumption of local equilibrium 
is valid) and for non-equilibrium steady states in particular; it is not cer- 
tain, however, whether and to what extent this is to be taken to constitute 


a limitation as a matter of principle. 


In a dissipative system, the Jacobian determinant $J_t(X)$ introduced above
differs from unity and one can work out the entropy gained by the system under
consideration (we call it ‘$S$’).

Referring to the expression for the Gibbs entropy in eq. (9-124), we write,
for the rate of change of entropy at time $t$,

$$\dot{S} = \frac{d}{dt}S[\rho_t] = -\int dX\,\frac{\partial}{\partial t}\left(\rho_t\ln\rho_t\right), \tag{9-130}$$

where $\rho_t (= \rho(t))$, the evolving probability density at time $t$, satisfies (9-112a).


Writing the equations of motion in the compact form

$$\dot{X} = F(X), \tag{9-131a}$$

(recall that $X$ stands collectively for the position and the momentum
vectors $\mathbf{r}_i, \mathbf{p}_i$ ($i = 1, 2, \cdots, N$) of the particles making up the system), we
write (9-112a) in the form

$$\frac{\partial\rho}{\partial t} = -\nabla\cdot(\rho F), \tag{9-131b}$$

(check this out; note that $X$ and $F$ are $6N$-component vectors each in the
phase space). Making use of this formula in (9-130), one obtains


$$\dot{S} = -\int dX\,\frac{\partial}{\partial t}\left(\rho\ln\rho\right) = \int dX\,\rho\,\nabla\cdot F(X), \tag{9-132}$$

where one needs to perform two integrations by parts and to make use
of appropriate boundary conditions on $\rho$ to arrive at this formula (check
this out). We recall that $\rho (= \rho_t)$ is the probability density corresponding
to the measure $\mu_t(dX) = \rho_t\,dX$. We now consider the limit $t \to \infty$ when
the measure $\mu_t(dX)$ becomes singular and converges (weakly) to the SRB
distribution for a NESS under the action of a thermostat. We thereby get,
for such a NESS,

$$\dot{S} = \int \mu_{\mathrm{SRB}}(dX)\,\nabla\cdot F(X). \tag{9-133a}$$

The physical interpretation of $\dot{S}$ in this formula is that it represents the
rate of flow of entropy into the system under consideration from the environment
made up of the thermostat(s), which is of notional existence in
the case of a mechanical thermostat, i.e., one simulating the action of an
actual thermostat by means of additional terms in the equations of motion
as in the case of a GIK, GIE or N-H thermostat introduced in sec. 9.9.
The rate of flow of entropy from the system to the environment is then

$$\dot{S}^{[\mathrm{thermostat}]} = -\int \mu_{\mathrm{SRB}}(dX)\,\nabla\cdot F(X). \tag{9-133b}$$

Finally, since this amount of entropy flows from the system to the environment
while, at the same time, the system itself is in a steady state, the
rate of production of entropy due to the irreversible dissipative processes
taking place in it has to be equal, which leads us to the formula

$$\dot{S}^{[\mathrm{irrev}]} = -\int \mu_{\mathrm{SRB}}(dX)\,\nabla\cdot F(X). \tag{9-133c}$$


The entropy accounting in the NESS can thus be stated as follows: 


the rate of production of entropy in a NESS due to irreversible processes 
occurring in the system (eq. (9-133c)) equals the rate of entropy flow to the 


environment (eq. (9-133b)). 
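
The two integrations by parts behind (9-132), on which the formulae (9-133a) to (9-133c) rest, can be written out step by step. This is a worked sketch assuming that $\rho$ vanishes sufficiently fast at the boundary of the phase space (so that all boundary terms drop out) and that the normalization $\int \rho\,dX = 1$ is preserved in time.

```latex
\begin{aligned}
\dot{S} &= -\int dX\,\frac{\partial\rho}{\partial t}\,(\ln\rho + 1)
         = \int dX\,\nabla\cdot(\rho F)\,(\ln\rho + 1)
         && \text{(using (9-131b))} \\
        &= -\int dX\,\rho\,F\cdot\nabla\ln\rho
         = -\int dX\,F\cdot\nabla\rho
         && \text{(first integration by parts)} \\
        &= \int dX\,\rho\,\nabla\cdot F
         && \text{(second integration by parts).}
\end{aligned}
```

For a thermostated system the mean value of $\nabla\cdot F$ is negative, so $\dot{S}$ is negative and the entropy flowing out to the thermostat, with the opposite sign, is positive.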


In the case of a mechanical thermostat simulating one or more actual ther- 
mostats the formula (9-133b) is to be looked upon as an interpretation of 
the rate of flow of entropy out of the system. One can, for the sake of 
concreteness, think of an actual thermostat system causing a NESS to be 


established. For instance, a steady flow of heat through a uniform rod
of cross-section $A$ and length $L$ can be maintained by means of two thermostats,
one delivering heat quasi-statically to the system at a temperature,
say, $T_1$ at one end, and the other absorbing the same amount of heat
quasi-statically at a temperature, say, $T_2 (< T_1)$ at the other end.


If, in time $\tau$, the heat transferred is $Q$, then the rate of flow of entropy to
the thermostats considered together is $\dot{S}^{[\mathrm{thermostat}]} = \frac{Q}{\tau}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)$. In a steady
state, the heat flux through the system is $j_Q = \frac{Q}{A\tau}$, while the affinity is
$\mathcal{A}_Q = \nabla\left(\frac{1}{T}\right) = \frac{1}{L}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)$. Here we have assumed that the affinity and current
are both small, i.e., the system is in a steady state close to equilibrium
at temperature $\frac{T_1 + T_2}{2}$. One thus finds that the rate of flow of entropy to the
thermostats is $\dot{S}^{[\mathrm{thermostat}]} = AL\,j_Q\,\mathcal{A}_Q$. In accordance with what we saw in
sec. 8.2.2, the right hand side represents the rate of entropy production in
the system due to the irreversible flow of heat whose definition is unambiguous
for non-equilibrium states close to equilibrium ones. This justifies
the identification of $\dot{S}^{[\mathrm{thermostat}]}$ with $\dot{S}^{[\mathrm{irrev}]}$.


In this example, the external ‘field’ is non-mechanical, being in the nature 
of a temperature gradient. A mechanical example is that of an electric field 
(E) that produces Joule heat in the system which is to be transferred to a 
thermostat so as to keep the system in steady state. Once again, the rate of 
transfer of entropy to the thermostat (an actual reservoir of heat) is found 
to be equal to the irreversible entropy production rate in the system, both 
being given by the expression $\frac{\mathbf{j}\cdot\mathbf{E}}{T}$ per unit volume (j stands for the current
density; refer back to sec. 9.9.7 where we consider the case of the Lorentz 


gas). 
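
As a quick arithmetic check of the heat-conduction example, the following sketch evaluates the entropy flow to the two thermostats in two ways: directly from the heat exchanged, and as the product of volume, heat flux, and affinity. The function names and the numerical values are illustrative assumptions, not data from the text.

```python
def entropy_flow_to_thermostats(Q, tau, T1, T2):
    """Rate of entropy flow to the two thermostats: (Q / tau) * (1/T2 - 1/T1)."""
    return (Q / tau) * (1.0 / T2 - 1.0 / T1)


def entropy_production_from_flux(Q, tau, T1, T2, area, length):
    """Volume times heat flux times affinity, for a uniform rod."""
    j_Q = Q / (area * tau)                      # heat flux through the rod
    affinity = (1.0 / T2 - 1.0 / T1) / length   # gradient of 1/T along the rod
    return area * length * j_Q * affinity


if __name__ == "__main__":
    # Illustrative values only: 10 J transferred in 2 s between 310 K and 300 K.
    Q, tau, T1, T2 = 10.0, 2.0, 310.0, 300.0
    area, length = 1.0e-4, 0.5
    print(entropy_flow_to_thermostats(Q, tau, T1, T2))
    print(entropy_production_from_flux(Q, tau, T1, T2, area, length))
```

The two numbers coincide, and both are positive whenever $T_1 > T_2$, in line with the identification of the entropy flow to the thermostats with the irreversible entropy production.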


It is important to note that the right hand side of (9-133c) provides us with 


a possible indicator of irreversibility in non-equilibrium steady states of
macroscopic systems, whose applicability is not confined to near-equilibrium
situations. However, in order that this be actually acceptable as such an
indicator, one needs to establish that it is a non-negative quantity, and
is zero only exceptionally (e.g., in the case of an equilibrium state for a
closed Hamiltonian system when $\nabla\cdot F = 0$ and $\mu_{\mathrm{SRB}}$ reduces to the
microcanonical distribution).


From the definition of $F(X)$, one identifies $-\nabla\cdot F$ as the phase space
contraction rate (refer back to sec. 9.9.6) defined locally in the phase
space. On the other hand, the phase space contraction rate is determined
by the local Lyapunov exponents as in (9-113a) since the latter determine
the instantaneous stretching rates of infinitesimally small line elements
along local stretching directions $e_i$ mentioned in sec. 9.5.2.2 corresponding
to each of the Lyapunov exponents.


In other words, as stated in sec. 9.9.6, the rate of entropy production
in a NESS is given by

$$\dot{S}^{[\mathrm{irrev}]} = -\sum_i \langle\lambda_i\rangle, \tag{9-134}$$

where the right hand side involves an average over the steady state distribution
$\mu_{\mathrm{SRB}}$. However, because of the ergodicity of $\mu_{\mathrm{SRB}}$, the Lyapunov
exponents are constant almost everywhere on the attractor (recall that
the support of $\mu_{\mathrm{SRB}}$ is a fractal set), and the averaging ($\langle\cdot\rangle$) is actually
redundant.


Formula (9-134) is of major relevance since it expresses the rate of entropy
production, a macroscopically defined quantity, in terms of quantities
defined with reference to the microscopic dynamics. However, one
still needs to establish that this quantity is positive (reducing to zero in
the case of an equilibrium state) so as to establish its role as the indicator
of irreversibility of dissipative processes in the system under consideration.
This was established by Ruelle in [123] for a large class of systems,
where he considered smooth dynamical systems defined by mappings on
compact manifolds, and where a mapping may arise as the Poincaré section
of a flow. The formula for the entropy production in terms of the average
phase space contraction rate in the case of a mapping is of the same
form as in the case of a flow, and is hence given by (9-134). The mappings
considered in [123] (recall earlier mention in sec. 9.9.6) include diffeomorphisms,
a class of non-invertible maps, and also a class of systems where
the map has a non-attracting set that carries a non-equilibrium measure
$\mu$ associated with a diffusion process.


The last-named case pertains to hydrodynamic modes characterized by frac- 
tal structures in extended Hamiltonian systems with quasi-periodic bound- 


ary conditions, considered in [52]. 


In the case of a diffeomorphism, the proof of positivity of the entropy production is based on the fact that, if $\mu_{\mathrm{SRB}}(dX)$ represents the SRB measure for the map $\phi$ then, generally speaking, it does not represent the SRB measure for the inverse mapping $\phi^{-1}$. In the special case that $\mu_{\mathrm{SRB}}$ does happen to be the SRB measure for $\phi^{-1}$, the entropy production reduces to zero, this being the case for the Poincaré map for a closed Hamiltonian system, for which the SRB distribution is smooth along both the stable and unstable manifolds, and reduces to the microcanonical distribution.


Recall that, if $h_\mu$ is the KS entropy, with respect to a measure $\mu$, of a dynamical system defined by the mapping $\phi$, then, in general, it satisfies the inequality (9-64), while the Pesin equality (eq. (9-63b)) is satisfied if $\mu$ happens to be the natural distribution for the dynamics defined by $\phi$, i.e., precisely, is the SRB distribution for $\phi$. The proof establishing the positivity of the entropy production rests on this inequality.


Moreover, the singular nature of $\mu_{\mathrm{SRB}}$ (which is the case for a thermostated system) guarantees that, for a large class of mappings $\phi$, $\dot{S}^{[\mathrm{irrev}]}$ is actually positive [123].


We now work out the phase space contraction rate for the GIK thermostat 


by referring to (9-112c), (9-107a), from which we get 


$$\chi(X) = -\sum_i \frac{\partial}{\partial p_i}\cdot\dot{p}_i = \sum_i \frac{\partial}{\partial p_i}\cdot(\alpha\, p_i),$$ (9-135a)


where the phase space dependent parameter $\alpha$ is given by the first line of (9-107b), which implies that the constraint need not be taken into account separately. Assuming that the forces $F_i$, $F_i^{(\mathrm{ext})}$, $F_i^{(\mathrm{int})}$ ($i = 1, 2, \cdots, N$) depend only on the position vectors of the particles, and noting that there are a total of $3N$ components of the momenta $p_i$ ($i = 1, 2, \cdots, N$), one obtains the local phase space contraction rate as

$$\chi(X) = (3N - 1)\,\alpha$$ (9-135b)


(check this out; this is consistent with (9-116b) where one has $N = 1$ and $E$, $p$ are 2D vectors). We now evaluate the expectation value with respect to the NESS distribution by recalling that there are $3N - 1$ independent momentum components and by defining a kinetic temperature $T$ as

$$(3N - 1)\left(\tfrac{1}{2} k_B T\right) = \sum_i \frac{p_i^2}{2 m_i}.$$ (9-136)
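The relation $\chi = (3N-1)\alpha$ of (9-135b) can be checked numerically. The sketch below assumes, as a stand-in for (9-107a) and (9-107b), which are not reproduced here, the standard Gaussian isokinetic form $\dot{p}_k = f_k - \alpha p_k$ with $\alpha = \sum_k (f_k p_k/m_k) / \sum_k (p_k^2/m_k)$, the index $k$ running over the $3N$ momentum components and the forces $f_k$ being independent of the momenta. All function and variable names are hypothetical. The divergence with respect to the momenta is evaluated by central finite differences and compared with $(3N-1)\alpha$.

```python
import random

def alpha(p, f, m):
    """Assumed isokinetic multiplier: sum(f.p/m) / sum(p.p/m)."""
    num = sum(fk * pk / mk for fk, pk, mk in zip(f, p, m))
    den = sum(pk * pk / mk for pk, mk in zip(p, m))
    return num / den

def contraction_rate(p, f, m, h=1e-6):
    """chi = -sum_k d(pdot_k)/dp_k with pdot_k = f_k - alpha(p) p_k,
    evaluated by central finite differences."""
    chi = 0.0
    for k in range(len(p)):
        def pdot_k(value, k=k):
            q = list(p)
            q[k] = value
            return f[k] - alpha(q, f, m) * q[k]
        chi -= (pdot_k(p[k] + h) - pdot_k(p[k] - h)) / (2.0 * h)
    return chi

rng = random.Random(1)
n_particles = 4
masses = [rng.uniform(0.5, 2.0) for _ in range(n_particles)]
m = [mi for mi in masses for _ in range(3)]      # one mass per component
p = [rng.uniform(-1.0, 1.0) for _ in range(3 * n_particles)]
f = [rng.uniform(-1.0, 1.0) for _ in range(3 * n_particles)]

chi_fd = contraction_rate(p, f, m)
chi_formula = (3 * n_particles - 1) * alpha(p, f, m)
print(chi_fd, chi_formula)
```

The factor $3N - 1$ rather than $3N$ arises because $\alpha$ is homogeneous of degree $-1$ in the momenta, so that $\sum_k p_k\,\partial\alpha/\partial p_k = -\alpha$.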


Assuming further that the internal forces on the particles are all derivable from a potential $\Phi$ ($F_i^{(\mathrm{int})} = -\partial\Phi/\partial r_i$), and observing that, in the steady state, $\langle\Phi\rangle$ has to be time-independent, we obtain

$$\frac{\langle\chi\rangle}{V} = \frac{1}{V k_B T}\Big\langle\sum_i F_i^{(\mathrm{ext})}\cdot\frac{p_i}{m_i}\Big\rangle$$ (9-137)

(check this out) where the left hand side refers to the phase space contraction rate per unit volume $V$ (in physical space). One can further compress this expression by assuming that the external force is independent of the particle index $i$ and noting that

$$\frac{1}{V}\Big\langle\sum_i\frac{p_i}{m_i}\Big\rangle = j$$ (9-138a)

is the total particle current density due to the external force $F^{(\mathrm{ext})}$. This gives, finally, the entropy production rate per unit volume (in which the factor $k_B$ is restored; recall that, in the earlier paragraphs, the entropy production was defined in units of $k_B$) as

GIK thermostat: $\dot{s}^{[\mathrm{irrev}]} = j\cdot A$, (9-138b)

where $A = F^{(\mathrm{ext})}/T$ defines the thermodynamic affinity causing the flux $j$. This expression is formally consistent with the one arrived at (refer to formula (8-10b)) from considerations in non-equilibrium thermodynamics based on the assumption of local thermodynamic equilibrium. Recall, however, that, in the present context, $T$ is a notionally defined temperature and the steady state under consideration need not be close to an equilibrium configuration.


An analogous exercise (see [21]) carried out for the GIE thermostat gives the same result, but now only for large $N$,

GIE thermostat: $\dot{s}^{[\mathrm{irrev}]} = j\cdot A \quad (N \to \infty)$. (9-138c)

In other words, the various thermostating mechanisms lead to equivalent results in the thermodynamic limit (refer back to earlier statements in this regard in sec. 9.9.7).


9.12 Gallavotti-Cohen fluctuation relation 


The assumptions of dissipation, reversibility, and strong chaos (the ‘chaotic hypothesis’) lead to a number of conclusions that are likely to be of relevance in the understanding of non-equilibrium configurations of macroscopic systems. Among these, the fluctuation theorem of Gallavotti and Cohen ([46], [49]) will be outlined in the present section.


We consider a macroscopic system in a non-equilibrium steady state (NESS) and recall the description of the microscopic time evolution with reference to an appropriate Markov partition as in sec. 9.10.3. We describe the time evolution in terms of a mapping $\phi$, where the mapping $X \to \phi(X)$ may represent the result of an underlying flow during a short interval of time. In the following, we will consider the evolution over a long stretch of time $T$ (not to be confused with the same symbol denoting temperature, which will not appear in the present section), divided into short stretches of duration $\tau$ each, during which the map operates on the representative point $X$ in the phase space $\nu$ times. If $t_0$ is the average time between successive actions of $\phi$ (the interval between successive crossings of a Poincaré surface of section, for instance), then $\tau = \nu t_0$. For the sake of convenience of presentation, we take $\nu$ to be an even integer.


Fig. 9-10 depicts a trajectory of the representative phase point (of notional significance) covering a stretch of time $T$, divided into segments of duration $\tau$ each, as indicated above. The phase space is imagined to be partitioned into small cells by means of a Markov partition that can be made as fine as we please: starting from any Markov partition, one can apply the map $\phi$ any number of times in the forward and backward directions and take all possible intersections of the resulting cells to arrive at a partition with cells of the desired degree of smallness.

Figure 9-10: A stretch of trajectory of duration $T$ of the representative point in the phase space, divided into segments of duration $\tau$; the phase space is imagined to be partitioned into small cells by means of a Markov partition (not shown) that can be made as fine as we please; for each of the segments of duration $\tau$, one can define a scaled entropy production $a_\tau$ as in (9-139), where $a_\tau$ is a random variable, assuming different values for the various segments with centers ($X$) varying from one segment to another; while the entropy production is positive for the majority of segments, it assumes negative values as well because of their finite duration; the probability density $p_\tau(a)$ of the random variable at any chosen value $a$ satisfies the Gallavotti-Cohen fluctuation relation (9-145) (for notation, see text).

Recall that the possibility of dividing up the phase space by means of Markov 

partitions made up of cells of arbitrarily small size is a consequence of the 

chaotic hypothesis where one assumes that the microscopic dynamics can 


be effectively described as that of an Anosov system or, more generally, as 


that on an Axiom-A attractor. 


As mentioned in sec. 9.10.3, each application of the map $\phi$ at a phase point $X$ leads to a volume expansion by a factor $\Lambda(X)$ along the unstable manifold (i.e., when we consider the mapping restricted to the unstable manifold of $X$). Further, with each point $X$, we can associate a phase space contraction rate $\chi(X)$, which can also be described as an entropy production rate, $\sigma(X)$ (for notation, see below).


Imagine, for a given phase point $X$, the points $\phi^j(X)$ with the integer index $j$ ranging from $-\frac{\nu}{2}$ to $\frac{\nu}{2} - 1$ (i.e., for $\nu$ successive applications of $\phi$, covering a time $\tau$), with reference to which we can define a scaled entropy production rate $a_\tau(X)$ (averaged over an interval $\tau$) as in (9-139) below, where the scaling is against the mean entropy production $\langle\sigma\rangle$, i.e., the phase space contraction rate averaged over the SRB distribution characterizing the NESS. Recall that this was denoted by the symbol $\dot{s}^{[\mathrm{irrev}]}$ in sec. 9.11.2. However, while $\dot{s}^{[\mathrm{irrev}]}$ represents the entropy production per unit volume, $\langle\sigma\rangle$ is defined as the entropy production per degree of freedom, though the distinction is of minor relevance, introduced in order to conform to the notation in [49] (from which our notation differs in details).


It is important to mention at this point that the equality of the time average 
and the phase space average in the case of the SRB distribution holds only 
when the time average is evaluated over the forward evolution since the dis- 
tribution for the backward evolution is different in the case of a dissipative 
map. This is in contrast to the case of a closed Hamiltonian system where 


the forward and backward averages give rise to the same natural measure. 
We define $a_\tau(X)$ ($\tau = \nu t_0$) as

$$\sigma_\tau(X) = \frac{1}{\nu}\sum_{j=-\nu/2}^{\nu/2-1}\sigma\big(\phi^j(X)\big) = \langle\sigma\rangle\, a_\tau(X).$$ (9-139)

In this formula, $\sigma_\tau(X)$ denotes the phase space contraction per degree of freedom averaged over the segment of trajectory of duration $\tau$ centered around $X$ (corresponding to $\nu$ successive iterations of the map), as given by the first equality. Since the averaging is performed over the finite interval $\tau$, there will be fluctuations in $a_\tau(X)$ when evaluated over distinct segments (all of duration $\tau$) of the trajectory (considered over a long stretch of time $T$).
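The construction of the scaled variable $a_\tau$ can be sketched in a few lines. The example below takes a series of per-step contraction rates, cuts it into consecutive windows of $\nu$ steps, and divides each window average by the overall mean, in the spirit of (9-139). The input series is surrogate data (independent Gaussian numbers with a positive mean), assumed purely for illustration; it is not generated by a chaotic map, and the function name is hypothetical.

```python
import random

def scaled_entropy_production(sigma, nu):
    """Return a_tau for consecutive windows of nu steps:
    (window average of sigma) / (average of sigma over all windows)."""
    n_win = len(sigma) // nu
    used = sigma[: n_win * nu]            # drop an incomplete last window
    mean_sigma = sum(used) / len(used)
    return [sum(used[w * nu:(w + 1) * nu]) / (nu * mean_sigma)
            for w in range(n_win)]

rng = random.Random(0)
sigma_series = [rng.gauss(0.2, 1.0) for _ in range(10000)]   # surrogate data
a_values = scaled_entropy_production(sigma_series, nu=10)

mean_a = sum(a_values) / len(a_values)
n_negative = sum(1 for a in a_values if a < 0.0)
print(mean_a, n_negative, len(a_values))
```

By construction the values of $a_\tau$ average to unity, while a finite fraction of the windows yields negative values, which is the feature the fluctuation relation quantifies.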


In other words, when considered over distinct segments of duration $\tau$ in a trajectory of length $T$ ($\to\infty$), $a_\tau$ is a random variable (the point $X$ at the ‘center’ of a segment varies from one segment to another), with a probability distribution $p_\tau(a)$, where $p_\tau(a)\,da$ is the probability of the variable $a_\tau$ having a value in the range $a$ to $a + da$. Though the entropy production averaged over an infinitely long stretch of trajectory (which is the same thing as the phase space average over the SRB distribution; recall that the time average is to be evaluated under forward evolution only) is positive, $p_\tau(a)$ can be non-zero for negative values of $a$ as well, since an average over a short duration of time is not equivalent to an average over the entire SRB distribution.


The Gallavotti-Cohen fluctuation theorem states that, for any $a > 0$, the ratio of $p_\tau(a)$ (the probability density for the random variable $a_\tau$ to have the value $a$) and $p_\tau(-a)$ (the probability density for the value $a_\tau = -a$) is given by

$$\frac{p_\tau(a)}{p_\tau(-a)} \sim e^{M\tau\langle\sigma\rangle a}$$ (9-140)

(see [49] for further details of the statement of the theorem; the notation in this reference is slightly different from ours), where $M$ is the effective number of degrees of freedom; as mentioned above, $\tau = \nu t_0$ is the duration of each of the large number of segments in which the entropy production (or, equivalently, the phase space contraction) $\sigma_\tau$ is evaluated, the total stretch of time for all the segments taken together being $T = 2Jt_0$, say, where the integer $J$ is to be compared with the corresponding symbol introduced in sec. 9.10.3.
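A simple consistency check on the form of the fluctuation relation is provided by a Gaussian density. Assuming, purely for illustration, that $a_\tau$ is Gaussian with mean 1 and variance $2/c$, where $c$ stands for the combination $M\tau\langle\sigma\rangle$, the logarithm of $p_\tau(a)/p_\tau(-a)$ equals $c\,a$ exactly. The Gaussian form is an assumption of this sketch and not a consequence of the theorem, which constrains only the ratio; the function name and the value of $c$ are hypothetical.

```python
import math

def log_ratio_gaussian(a, c):
    """ln[p(a)/p(-a)] for a Gaussian density with mean 1 and variance 2/c."""
    var = 2.0 / c

    def log_p(x):
        return -(x - 1.0) ** 2 / (2.0 * var) - 0.5 * math.log(2.0 * math.pi * var)

    return log_p(a) - log_p(-a)

c = 7.5                                   # stands for M * tau * <sigma>
for a in (0.1, 0.5, 2.0):
    print(a, log_ratio_gaussian(a, c), c * a)
```

The check shows that a Gaussian whose variance is tied to its mean in this particular way is compatible with the exponential form of the ratio, the slope being fixed by $M\tau\langle\sigma\rangle$.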


We now outline the derivation of (9-140), following [49], [50], where further details are to be found. In this, we refer back to sec. 9.10.3 where the SRB distribution is approached in a limiting sense in terms of a set of dynamical weights assigned to the cells of a sufficiently fine Markov partition such that, given a phase point $X$ in any of the cells, the points obtained by applying iterations $\phi^k$ on it for $-J < k < J$ can all be resolved within distinct cells. Thus, the weight assigned to a cell $E_j$, where $j$ is an index labeling the various cells in the partition, is proportional to $w_{\nu,j}$ given by (9-121). The quantity entering this formula is nothing but the phase space expansion coefficient (along the unstable manifold) under the application of the map $\phi^\nu$ applied to $\phi^{-\nu/2}X_j$ (which equals the absolute value of the determinant of the relevant Jacobian), where $X_j$ is any point in $E_j$; we call this expansion coefficient $\Lambda_{\nu,j}$. Considering the locations of the phase points marking the events of the action of the map $\phi$ during all the $\tau$-intervals, and looking at the cells ($E_j$) in which the phase points lie, we identify all the cells for which the random variable $a_\tau(X_j)$ has the value $a$, and also those for which it has the value $-a$, for any chosen $a$ ($> 0$).
At each of the cells so identified we evaluate the SRB weight $w_{\nu,j}$ mentioned above (refer back to (9-121)). Then, the left hand side of (9-140) is given by the ratio of the aggregate of the SRB weights of the cells for which $a_\tau(X_j) = a$ and the corresponding aggregate for the cells for which $a_\tau(X_j) = -a$ (in the limit $J \to \infty$):

$$\frac{p_\tau(a)}{p_\tau(-a)} = \frac{\sum_j{}^{\prime}\, w_{\nu,j}}{\sum_j{}^{\prime\prime}\, w_{\nu,j}},$$ (9-141a)


where the single prime in the numerator on the right hand side indicates that the sum is to be evaluated over those cells for which $a_\tau(X_j) = a$, while the double prime in the denominator denotes a sum over cells for which $a_\tau(X_j) = -a$. Recalling that the (un-normalized) SRB weight $w_{\nu,j}$ is nothing but the inverse of the dynamical expansion coefficient (along the unstable manifold; note slight change in notation as compared to (9-121)) corresponding to the application of the map $\phi^\nu$ at $\phi^{-\nu/2}X_j$, and denoting this by $(\Lambda_{\nu,j})^{-1}$, we can write (9-141a), for the sake of notational clarity, in the alternative form

$$\frac{p_\tau(a)}{p_\tau(-a)} = \frac{\sum_j{}^{\prime}\,(\Lambda_{\nu,j})^{-1}}{\sum_j{}^{\prime\prime}\,(\Lambda_{\nu,j})^{-1}},$$ (9-141b)

where the ratio on the right hand side makes unnecessary the normalization of the relative weights $w_{\nu,j} = (\Lambda_{\nu,j})^{-1}$.


At this point we need to make use of the reversibility of the dynamics, i.e., the existence of a reversing transformation $R$ as in sec. 9.10.2 ($R^2 = I$, $R\phi R = \phi^{-1}$). Under the reversing transformation, the stable manifold of a point in the phase space gets transformed to the unstable manifold, and the following relations hold
a,(X) = =@(RX), Ce = Pai Big (9-142) 


where the second equation involves the expansion coefficient along the stable manifold (rather than the unstable one), i.e., the determinant of the relevant Jacobian matrix, evaluated at $RX_j$, i.e., at a point within the cell resulting from the action of the reversing operator $R$ on the cell $E_j$ (once again, the notation differs from [49], [50] for the sake of simplicity of presentation).

One can then express the ratio in (9-141b) in the form (refer to [49])


p-(a) _ DU, me 
7a) = 7 os ‘ (9-143) 


Notice that the summation in the denominator on the right hand side is now over the cells where $a_\tau = a$ (as indicated by the single prime attached to the summation sign) since, under the reversing transformation, $a$ gets changed to $-a$. In other words, there is now a pair-wise correspondence between terms in the numerator and the denominator of (9-143), where the summands in a pair of corresponding terms are $(\Lambda_{\nu,j})^{-1}$ and the stable-manifold expansion coefficient introduced in (9-142). Recalling the definitions of the two factors making up the ratio of these summands, one observes that their product is related to the inverse of the overall expansion coefficient (including the expansions along both the stable and unstable manifolds) in $\nu$ applications of the map from $\phi^{-\nu/2}X_j$ to $\phi^{\nu/2-1}X_j$, differing from the latter because of the angle between the relevant stable and unstable manifolds, which are transversal (by the property of hyperbolicity) but are not necessarily orthogonal to each other.
Given any two points $X$, $X'$ in the phase space, if we define

$$B = \max_{X, X'} \frac{|\sin\theta(X)|}{|\sin\theta(X')|},$$ (9-144)

where $\theta(X)$ denotes the angle between the stable and unstable manifolds at $X$, then we have $0 < B < \infty$. Further, the inverse of the overall expansion coefficient between $\phi^{-\nu/2}X_j$ and $\phi^{\nu/2}X_j$ (the latter obtained by the action of $\phi$ on $\phi^{\nu/2-1}X_j$) is, by (9-114), nothing but $\exp\big(M\nu\,\sigma_\tau(X_j)\big)$, where $\sigma_\tau(X_j)$ is the phase space contraction per degree of freedom (per single application of the map $\phi$) between $\phi^{-\nu/2}X_j$ and $\phi^{\nu/2}X_j$ (refer to eq. (9-139); recall that $M$ stands for the effective number of degrees of freedom of the system under consideration). In other words, the ratio of the two corresponding summands is bounded between $\exp\big(M\nu\,\sigma_\tau(X_j) \pm \ln B\big)$ (reason this out). What is more, the prime attached to the summation signs in (9-143) tells us that only those cells are included in either summation for which the random variable $a_\tau(X_j)$ has the value $a$.


Thus, finally, we obtain

$$\frac{p_\tau(a)}{p_\tau(-a)} = \exp\big(M\tau\langle\sigma\rangle a\big),$$ (9-145)

where we have omitted a correction term in the argument of the exponential on the right hand side (recall the ‘~’ sign in (9-140)), the error being independent of $a$ and $\tau$. In other words, one can interpret the above equation as a large deviation formula that becomes exact in the limit of large values of $\tau a$. Note further that we have now made use of a modified definition of $\sigma$ on the right hand side of the equation as the entropy production per degree of freedom per unit time (instead of per application of the map $\phi$; recall that $\tau = \nu t_0$), keeping in mind that the map may actually be the result of an underlying flow operating for time $t_0$ (we assume for simplicity that the time interval between successive actions of the map is constant).


The fluctuation theorem has been verified in numerous NEMD simula- 


tions. 


9.13 Perturbations of non-equilibrium steady 
states 


Analogous to the linear response theory around equilibrium states, there 
have been efforts to develop the theory of time-dependent processes around 
non-equilibrium stationary states, where the latter may be ones far from 
equilibrium. Thus, for thermostated systems with far-from-equilibrium 
stationary states described by SRB distributions, one needs to consider 
time dependent perturbations of the SRB distributions in order to de- 


scribe and explain time-dependent processes. 


Efforts in this direction have resulted in the development of a general 
theoretical framework [125], though there remain questions of a technical 
nature as to the applicability of that framework to concrete problems, 
since the mathematical requirements underlying the theory appear to be 


too stringent for systems of physical interest. 


Fundamentally, one needs to work out the deviation, in the linear approximation, from the SRB distribution characterizing a NESS. This is, in principle, possible in the case of an Anosov system or an Axiom-A attractor because of the property of uniform hyperbolicity, where the local Lyapunov exponents are bounded away from zero independently of the location in the phase space.


Most systems of practical interest are, however, non-uniformly hyperbolic 
(though, even hyperbolicity is of doubtful validity for systems with a large 
number of particles). It is likely, however, that in the thermodynamic limit, 
the results derived on the basis of stringent mathematical assumptions may 
enjoy substantial validity. Such appears to be the case with the ergodic 


hypothesis and, presumably, with the chaotic hypothesis too. 


9.13.1 A general framework for non-equilibrium processes 


In the following, we set up the formal statement of a linear perturbation formula for the SRB state of a time independent dynamics, where the perturbation may depend on time. An even more general situation relates to the time dependent perturbation of an unperturbed dynamics that is itself time dependent. Thus, let us consider a time dependent flow of the form [125]

$$\frac{dX}{dt} = F_t(X) + \delta F_t(X),$$ (9-146)


where $X$ stands for the location of the representative point in the phase space, indicating the instantaneous microscopic state of the system under consideration, $F_t$ determines the unperturbed time-dependent flow, while $\delta F_t$ stands for the perturbation, assumed to produce only a small effect on the unperturbed flow. The basic assumption is that for any given $t$, the right hand side of (9-146) corresponds to a vector field that is close to one describing an Anosov flow (the case of mappings close to Anosov diffeomorphisms has been considered in greater detail in the literature; in any case, the results stated below for time dependent flows are to be treated as formal ones). By the structural stability of Anosov systems, one can define an SRB state for a dynamics determined by $F_t + \delta F_t$ for any specified $t$. In the following, the SRB distribution corresponding to the flow defined by $F_t$ for any chosen fixed value of $t$ will be denoted by $\mu_t$, while the perturbed distribution, again for the arbitrarily chosen but fixed value of $t$, will be denoted by $\mu_t + \delta\mu_t$. We denote the evolution operators for the unperturbed and perturbed flows by $U_{t,s}$ and $U_{t,s} + \delta U_{t,s}$ respectively.


In the case of a time-independent flow $\dot{X} = F(X)$, the transformation from an initial position $X(s)$ to a final position $X(t)$ in the phase space is of the form $X(t) = U_{t-s}X(s)$, i.e., the evolution operator depends only on the difference of the final and initial times. For a time-dependent flow of the form $\dot{X} = F_t(X)$, however, the evolution operator does not, in general, depend on the initial and final times $s, t$ only through the difference $t - s$, and one has, more generally, a transformation of the form $X(t) = U_{t,s}X(s)$.


Starting from a probability distribution $\rho(X)$ (i.e., from a measure $\rho(dX) = \rho(X)\,dX$, absolutely continuous with respect to the Lebesgue measure in the phase space), one defines the SRB measures $\mu_t$ and $\mu_t + \delta\mu_t$ as

$$\mu_t(\phi) = \lim_{s\to-\infty}\rho(\phi\circ U_{t,s}),$$
$$\mu_t(\phi) + \delta\mu_t(\phi) = \lim_{s\to-\infty}\rho\big(\phi\circ(U_{t,s} + \delta U_{t,s})\big),$$ (9-147)


where $\phi$ stands for a sufficiently well-behaved test function and $\mu(\phi)$ for the expectation value of $\phi$ with respect to the probability measure $\mu(dX)$, given by $\mu(\phi) = \int \mu(dX)\,\phi(X)$. At earlier places in this book we have denoted this by $\langle\phi\rangle_\mu$. Further, a symbol such as $\phi\circ U$ denotes a composition of the transformations $U$ and $\phi$ such that $(\phi\circ U)(X) = \phi(U(X))$.


For any arbitrarily chosen but fixed $t$, the right hand side of the first line of (9-147) gives the measure of the function $\phi$, evolved by $U_{t,s}$ for the chosen and specified $t$, from the infinitely remote past ($s \to -\infty$) up to the time $t$; by the definition of the SRB distribution, this represents the measure (i.e., the expectation value) of $\phi$ with respect to $\mu_t$.


By a systematic but formal perturbation calculation one arrives at the following result:

$$\delta\mu_t(\phi) = \int_{-\infty}^{t} d\tau \int \mu_t(dX)\,\Big(\big(T_{X(\tau,t)}U_{t,\tau}\big)\,\delta F_\tau\big(X(\tau,t)\big)\Big)\cdot\nabla_X\phi.$$ (9-148)


[Notation] In formula (9-148), an expression of the form $T_X U$ denotes the tangent map to $U$ at $X$, i.e., in terms of components (such as $X_i$ ($i = 1, 2, \cdots$) of $X$ in the phase space), the Jacobian matrix with elements $\frac{\partial (U(X))_i}{\partial X_j}$ ($i, j = 1, 2, \cdots$). In other words,


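$$\Big(\big(T_{X(\tau,t)}U_{t,\tau}\big)\,\delta F_\tau\big(X(\tau,t)\big)\Big)\cdot\nabla_X\phi = \sum_{i,j}\frac{\partial X_i}{\partial X_j(\tau,t)}\,\big(\delta F_\tau(X(\tau,t))\big)_j\,\frac{\partial}{\partial X_i}\phi(X).$$ (9-149)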


Further, in (9-148) and (9-149), $X(t,\tau)$ stands for the solution to the unperturbed equation $\dot{X} = F_t(X)$ (recall that our aim is to evaluate $\delta\mu_t(\phi)$ formally, up to terms of the first order of smallness in the perturbation $\delta F_t(X)$) that, when substituted in the latter for any initial value $X(\tau) = X(\tau,\tau)$ at time $\tau$, turns it into an identity; i.e., $X(t,\tau)$ is the phase point resulting from $X(\tau)$ when the latter is propagated by $U_{t,\tau}$. Finally, in (9-148), $X$ stands for $X(\tau)$ at large negative $\tau$.


[Note on derivation] The central result (9-148) is obtained by starting from an initial distribution $\rho(dX)$ (understood to be at initial time $s \to -\infty$) and by expanding the right hand side of the second equality in (9-147) up to the first order of smallness, while canceling the leading term by the first equality. This gives

$$\delta\mu_t(\phi) = \lim_{s\to-\infty}\int \rho(dX)\,\nabla_{X(t,s)}\phi\cdot\delta X(t,s),$$ (9-150a)

where $\delta X(t,s)$ is the first order correction to $X(t,s)$, given by

$$\delta X(t,s) = \int_s^t d\tau\,\big(T_{X(\tau,s)}U_{t,\tau}\big)\,\delta F_\tau\big(X(\tau,s)\big),$$ (9-150b)

and where an alternative form of the above formula in terms of components is

$$\delta X_i(t,s) = \int_s^t d\tau \sum_j \frac{\partial X_i(t,s)}{\partial X_j(\tau,s)}\,\big(\delta F_\tau(X(\tau,s))\big)_j.$$
Substitution of (9-150b) in (9-150a), along with a number of formal manipulations like change of variables within an integral, and the use of a formula of the general form

$$(\mathcal{F}_{t,s}\,\rho)(\phi) = \rho(\phi\circ U_{t,s}),$$ (9-151)

leads to (9-148). In formula (9-151), $\mathcal{F}_{t,s}$ stands for the Frobenius-Perron operator corresponding to the unperturbed evolution operator $U_{t,s}$. For more precise explanations, refer to [125].
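The first order correction to a trajectory, written as an integral of the tangent map acting on the perturbing vector field, can be illustrated on a scalar example. The sketch below assumes the linear flow $\dot{X} = -X$ perturbed by $\delta F_\tau = \epsilon\cos\tau$, for which the tangent map from time $\tau$ to time $t$ is simply $e^{-(t-\tau)}$ and the first order formula happens to be exact. This model, the Runge-Kutta integrator, and all names are assumptions made for illustration and do not appear in the text.

```python
import math

def rk4(f, x0, t0, t1, n=2000):
    """Integrate dx/dt = f(t, x) from t0 to t1 with n fixed RK4 steps."""
    h = (t1 - t0) / n
    x, t = x0, t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def first_order_shift(eps, s, t, n=2000):
    """Trapezoidal estimate of int_s^t dtau exp(-(t - tau)) * eps * cos(tau),
    i.e. tangent map times perturbation, integrated along the trajectory."""
    h = (t - s) / n
    g = lambda tau: math.exp(-(t - tau)) * eps * math.cos(tau)
    return h * (0.5 * g(s) + sum(g(s + k * h) for k in range(1, n)) + 0.5 * g(t))

eps, s, t, x0 = 0.01, 0.0, 5.0, 0.7
unperturbed = rk4(lambda tt, x: -x, x0, s, t)
perturbed = rk4(lambda tt, x: -x + eps * math.cos(tt), x0, s, t)
shift_direct = perturbed - unperturbed
shift_formula = first_order_shift(eps, s, t)
print(shift_direct, shift_formula)
```

For a nonlinear flow the tangent map would have to be obtained from the linearized equation along the unperturbed trajectory, and the two numbers would agree only up to terms of second order in $\epsilon$.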


9.13.2 Perturbations over steady state distributions 


As mentioned above, the perturbative result (9-148) is formal in nature, since the differentiability of the time dependent SRB distribution $\mu_t$ is not established on rigorous grounds. The case of a time-independent perturbation ($\delta f$) over an unperturbed dynamics determined by an Anosov diffeomorphism $f$ on a compact manifold has been treated rigorously in [124] where one obtains

$$\delta\mu(\phi) = \sum_{n=0}^{\infty}\mu\Big((\delta f\circ f^{-1})\cdot\nabla(\phi\circ f^{n})\Big),$$ (9-152a)

or, more explicitly,

$$\delta\mu(\phi) = \sum_{n=0}^{\infty}\int\mu(dX)\sum_j\big(\delta f(f^{-1}X)\big)_j\,\frac{\partial}{\partial X_j}\,\phi(f^{n}X).$$ (9-152b)

A particular instance of (9-148) is obtained for the case of a flow of the form

$$\dot{X} = F(X) + \delta_t F(X),$$ (9-153)

where $\delta_t F(X)$ (following notation used in [125]) represents a small time-dependent perturbation of a time-independent flow $\dot{X} = F(X)$, to which there corresponds a steady state described by the SRB distribution $\mu(dX)$ (say). One then obtains the first order variation of $\mu$ under the perturbation $\delta_t F$, with $t$ assumed to be held fixed at any chosen value, as

$$\delta_t\mu(\phi) = \int_{-\infty}^{t} d\tau\int\mu(dX)\,\Big(\big(T_{X(\tau-t)}U_{t-\tau}\big)\,\delta_\tau F\big(X(\tau-t)\big)\Big)\cdot\nabla_X\phi$$ (9-154)


(where the notation differs slightly from that in [125]). Here the unperturbed evolution operator of the form $U_{t,\tau}$ depends on $t, \tau$ only through their difference $t - \tau$ and, acting on $X$, the representative point in the phase space at time $\tau$, gives $X(t,\tau)$, which once again depends on $t, \tau$ only through $t - \tau$ and is denoted by $X(t - \tau)$.


Based on the definition of $\mu + \delta_t\mu$, one can interpret it as the instantaneous SRB distribution at time $t$ provided that $\delta_t F$ varies with $t$ sufficiently slowly. In other words, analogous to a quasi-static process through a succession of equilibrium states of a system, one can speak of a succession of steady states described by a corresponding succession of SRB distributions. This idea of a slow succession through steady states can be made use of in defining an entropy function for non-equilibrium steady states.


Formula (9-154) can be used to formally set up a linear response theory around a steady state where, generally speaking, the latter may represent a configuration far from equilibrium (the case of linear response close to an equilibrium state is outlined in sec. 9.13.3). Thus, one can define a response function $R_\sigma$, which is formally a linear operator that acts on vector fields (such as the one corresponding to the perturbation $\delta_t F$ for any specified value of $t$), such that the action of $R_\sigma$ on a vector field $\mathcal{X}$ (say) produces a function $\psi(X)$ (formally belonging to the dual space of phase space functions $\phi(X)$ looked upon as vectors, where the action of the dual $\psi$ on the vector $\phi$ yields the scalar product $\int\mu(dX)\,\psi(X)\phi(X)$).


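One writes (9-154) in the form

$$\delta_t\mu = \int_{-\infty}^{t} d\tau\, R_{t-\tau}\,\delta_\tau F,$$ (9-155a)

where $R_\sigma$ is defined as

$$(R_\sigma\mathcal{X}) = 0 \quad (\text{for } \sigma < 0 \text{ (causality)}),$$
$$(R_\sigma\mathcal{X})\phi = \int\mu(dX)\,\Big(\big(T_{X(-\sigma)}U_\sigma\big)\,\mathcal{X}(U_{-\sigma}X)\Big)\cdot\nabla_X\phi \quad (\text{for } \sigma > 0).$$ (9-155b)

It is straightforward to check that this definition of $R_\sigma$ gives (9-154) from (9-155a), which constitutes the formal generalization of the response function constituting the basis of linear response theory near equilibrium states.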


Analogous to the theory of linear response, a broader understanding of non-equilibrium processes close to a far-from-equilibrium steady state may be obtained in terms of the Fourier spectrum of the response function $R_\sigma$. The dynamic susceptibility $\chi_\omega$, which, like $R_\sigma$, transforms a vector field into a function $\psi$ belonging to the dual of the vector space of phase space functions (such that the action of $\psi$ on a function $\phi$ results in the inner product integral $\int\mu(dX)\,\psi(X)\phi(X)$), is defined as the Fourier transform

$$\chi_\omega = \int R_\sigma\, e^{i\omega\sigma}\, d\sigma.$$ (9-156)

According to this definition, choosing a vector field $\mathcal{X}$ and a phase space function $\phi$, the response of the system under consideration to a periodic perturbation applied to a steady state is of the general form $(\chi_\omega\mathcal{X})\phi$ (refer to the second line of (9-155b)).


The causality property ($R_\sigma = 0$ for $\sigma < 0$) implies that $\chi_\omega$ can be extended to an analytic function in the upper half of the complex $\omega$-plane, which further leads to the Kramers-Kronig dispersion relation

$$\chi_\zeta = \frac{1}{i\pi}\,\mathcal{P}\!\int\frac{\chi_\omega}{\omega-\zeta}\,d\omega,$$ (9-157a)

i.e., for any given vector field $\mathcal{X}$ and a given phase space function $\phi$,

$$(\chi_\zeta\mathcal{X})\phi = \frac{1}{i\pi}\,\mathcal{P}\!\int\frac{(\chi_\omega\mathcal{X})\phi}{\omega-\zeta}\,d\omega,$$ (9-157b)

where $\mathcal{P}$ denotes the principal value of an integral. This makes possible the formulation of the linear response theory in the frequency domain in analogy to sections 8.4.7, 8.4.8.
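The link between causality and analyticity in the upper half plane can be made concrete with a scalar stand-in for the response operator. The sketch below assumes a causal response of the form $R(\sigma) = e^{-\gamma\sigma}$ for $\sigma \ge 0$ (and zero otherwise), for which the Fourier integral with kernel $e^{i\omega\sigma}$ gives $1/(\gamma - i\omega)$. This response function, the truncation of the integral, and the function names are illustrative assumptions, not quantities defined in the text.

```python
import cmath
import math

def susceptibility_numeric(omega, gamma=1.0, sigma_max=40.0, n=40000):
    """Trapezoidal estimate of int_0^inf R(s) exp(i omega s) ds for the
    assumed causal response R(s) = exp(-gamma s), truncated at sigma_max."""
    h = sigma_max / n
    g = lambda s: math.exp(-gamma * s) * cmath.exp(1j * omega * s)
    total = 0.5 * (g(0.0) + g(sigma_max))
    for k in range(1, n):
        total += g(k * h)
    return h * total

def susceptibility_exact(omega, gamma=1.0):
    """Closed form of the same integral, valid for Im(omega) > -gamma."""
    return 1.0 / (gamma - 1j * omega)

for omega in (0.0, 3.0, 1.0 + 0.5j):
    print(omega, susceptibility_numeric(omega), susceptibility_exact(omega))
```

In this toy case the only singularity is a pole at $\omega = -i\gamma$ in the lower half plane, whose distance from the real axis sets the decay rate of the response; the susceptibility is analytic throughout the upper half plane, as causality requires.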


One can, moreover, define the time correlation function of any two observables (i.e., sufficiently well-behaved phase space functions) $A$, $B$ with respect to an SRB measure $\mu$ so that its Fourier transform $(AB)_\mu(\omega)$ can be extended in the complex $\omega$-plane to a meromorphic function in the domain $|\mathrm{Im}\,\omega| < \Lambda$ such that the poles in the lower half of the complex plane determine the exponential decay of the time correlation function for large positive $t$. These, precisely, are the Ruelle-Pollicott resonances referred to in sec. 9.5.7. Though the rigorous demonstration of such exponential decay requires a number of stringent assumptions (including the one of uniform hyperbolicity), it appears that such assumptions are not essential for a system with a large number of particles.


Finally, [125] explores the possibility of extending the fluctuation-dissipation theorem to the more general situation of non-equilibrium processes close to a far-from-equilibrium steady state. As one observes from (8-161b), the fluctuation-dissipation theorem relates the singularities of the dynamic susceptibility to those of the Fourier transforms of the equilibrium correlation functions. In the far-from-equilibrium situation, the dynamic susceptibility splits into a ‘stable’ and an ‘unstable’ part ($\chi^{(s)}(\omega)$, $\chi^{(u)}(\omega)$), corresponding to the splitting of the tangent space at any given point of the phase space into a stable and an unstable subspace for a hyperbolic system (ignoring the one-dimensional subspace tangent to the trajectory through the chosen point). Of the two, the singularities of $\chi^{(u)}(\omega)$ are related to those of the correlation function in the frequency domain (these singularities are independent of the observables $A$, $B$ referred to above), which can be interpreted as the required extension of the fluctuation-dissipation theorem; the singularities of $\chi^{(s)}(\omega)$, on the other hand, are not so related.
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9.13.3 Linear response theory revisited 


The consequences of the chaotic hypothesis, coupled with the feature of reversibility, have been inquired into in [45] in the case of concrete model systems involving Gaussian constraints that induced non-equilibrium steady states, where more than one flux and affinity were mutually coupled in each of the models. Assuming that the models represent large numbers of particles ($N \gg 1$), approximate expressions were set up for the phase space contraction rates and the rates of entropy production by making use of the SRB distributions constructed along the lines indicated in sec. 9.10.3.


The equations of motion in each of the models depend on a set of parameters (external fields and a temperature gradient) $h_\alpha$ ($\alpha = 1, 2$ in one of the models corresponding to two coupled currents; in the following, indices such as $\alpha$, $\gamma$ will be assumed to take up values 1, 2) that could be treated as macroscopic forces (‘affinities’ in the thermodynamic interpretation). In the approach based on the chaotic hypothesis (we call this the ‘SRB-based’ approach), macroscopic fluxes could be defined as ($\sigma$ stands for the entropy production)

$$j_\alpha = \frac{\partial \sigma}{\partial h_\alpha}. \tag{9-158}$$


This makes possible the definition of fluxes (corresponding to affinities) 


for steady states far from equilibrium. 


With fluxes and affinities so defined, kinetic coefficients can be defined as

$$L_{\alpha\gamma} = \left(\frac{\partial j_\alpha}{\partial h_\gamma}\right)_{\mathrm{eq}} \qquad (\alpha, \gamma = 1, 2), \tag{9-159}$$

where the sub-index ‘eq’ indicates that the derivatives are to be evaluated at equilibrium ($h_1 = h_2 = 0$). In other words, the kinetic coefficients were defined as in the traditional approach since these cease to be parameter-independent in the far-from-equilibrium regime. It then turns out that the Onsager reciprocity relations can be derived with the SRB-based definitions of currents inserted into (9-159).


At the same time one can make use of locally defined current densities 
introduced in terms of the local phase space contraction rates. It turns 
out that the Kubo formulae relating the kinetic coefficients to the time 
correlation functions of the currents are obtained as a consequence of 


the chaotic hypothesis along with the reversibility assumption. 


While these results have been obtained in the SRB-based approach for 
a number of specific models, a more general theory can be outlined by 
adapting the formulae of sec. 9.13.1, 9.13.2, where the basic results 
on time-dependent SRB states were stated. Recall that, formally speak- 
ing, eq. (9-148) describes the SRB state for a time dependent pertur- 
bation over a flow that may also be time-dependent (refer back to for- 


mula (9-146)). 


These time dependent results are to be interpreted with care. Thus, in (9-148), $\mu_t$ and $\mu_t + \delta\mu_t$ are the SRB states for the flows $\dot{X} = F_t(X)$ and $\dot{X} = F_t(X) + \delta F_t(X)$ with $t$ held fixed, i.e., with these representing ‘time-independent’ flows, assuming the right hand sides of these equations to be held frozen at some chosen value of $t$. In other words, the time dependence of the equations of motion (i.e., of the relevant vector fields) is required to be infinitesimally slow.


Particular instances of (9-148) describe various different situations of interest. Thus, we may consider time-independent perturbations of time-independent mappings (where ‘time’ is represented in terms of discrete integers) for which (9-152a) can be derived rigorously provided the mapping is an Anosov diffeomorphism defined over a compact manifold. In this case, $\bar{\mu}$ and $\delta\bar{\mu}$ describe, respectively, the actual SRB distribution for the unperturbed mapping and the first order perturbation over it.


Likewise, eq. (9-154), which again is of a formal nature, gives the time-dependent perturbation $\delta_t\bar{\mu}$ over the SRB distribution $\bar{\mu}$ describing the steady state for a time-independent flow.


One can specialize further to the case when the perturbation is over an 
equilibrium state, and hope to recover the results of linear response theory 
of non-equilibrium processes close to equilibrium. This would provide 
the theoretical basis for the results of the SRB-based analysis of concrete 


systems obtained in [45]. 


In this context, we mention that the theory of time-dependent perturbations of time-independent diffeomorphisms has been addressed rigorously (subject to mathematical stipulations of a technical nature), giving a generalization of (9-152a) in the form [51]

$$\delta_s\bar{\mu}(\phi) = \sum_{n=-\infty}^{s} \bar{\mu}\big(\nabla(\phi\circ f^{\,s-n})\cdot X_n\big). \tag{9-160}$$


Here the sub-indexes $s$, $n$ correspond to discrete time described by integers. The diffeomorphism $f$ describes the unperturbed dynamics, while $X_n$ stands for the time-dependent vector field describing the (time dependent) perturbation $\delta_n f$.

The convergence of the infinite series on the right hand sides of (9-152a) and (9-160) has been established, providing the basis for the respective equations.


While the results relating to diffeomorphisms are rigorously derived ones, we now refer to the case of flows where, starting from the general formula (9-148), one can specialize step by step as follows: (1) time dependent perturbations over time independent flows, giving the ‘instantaneous’ SRB state (i.e., one corresponding to the perturbation imagined to be held frozen for some chosen value of the time $t$) as a perturbation over the fixed SRB state corresponding to the unperturbed flow (formula (9-154)), (2) time-independent perturbations over time-independent flows (see below), where one obtains a SRB state $\bar{\mu} + \delta\bar{\mu}$ differing slightly from the unperturbed SRB state $\bar{\mu}$ and, finally, (3) time dependent perturbations over an equilibrium state, describing linear response in close proximity to the latter.
We mention, for the sake of completeness, the relevant formula for the step (2) above, obtained from (9-148),

$$\delta\bar{\mu}(\phi) = \int_0^\infty dt \int \bar{\mu}(dX)\, \nabla_X(\phi\circ U_t)(X)\cdot\delta F(X), \tag{9-161}$$


where $\bar{\mu}$ stands for the SRB state corresponding to the time-independent flow ($\dot{X} = F(X)$) described by the evolution operator $U_t$, while $\bar{\mu} + \delta\bar{\mu}$ is the perturbed distribution (up to first order of smallness) for the flow $\dot{X} = F(X) + \delta F(X)$. As mentioned above, formula (9-152a) is a rigorous statement (when considered along with appropriate mathematical stipulations including the one of uniform hyperbolicity), while eq. (9-161) is of a formal nature, though it is likely to prove effective for practical purposes in the case of physical systems made up of a large number of particles (recall that the ergodic hypothesis and the chaotic hypothesis enjoy a similar status).


Finally, we turn our attention to the case of linear response close to equilibrium where we indicate how the response can be obtained in a ‘SRB-based’ formalism as distinct from the traditional approach outlined in sec. 8.4. For this, we refer to the causal response function $R_\sigma$ defined in (9-155b), where an equivalent expression for $(R_\sigma\mathcal{X})\phi$ ($\sigma > 0$) is

$$\begin{aligned}(R_\sigma\mathcal{X})\phi &= \int \bar{\mu}(dY)\, \mathcal{X}(Y)\cdot\nabla_Y(\phi\circ U_\sigma)\\ &= -\int \bar{\mu}(dY)\, \big(\nabla_Y\cdot\mathcal{X}(Y)\big)\,\phi(U_\sigma Y) \qquad (\sigma > 0),\end{aligned} \tag{9-162}$$

(check this out; this requires a change of variable under the integral sign in formula (9-155b); the second line is obtained by performing an integration by parts, which is possible in the special case of an equilibrium state).


Recall that, formally, $R_\sigma$ acts on a vector field $\mathcal{X}$ which in the present context represents the perturbation of a flow, so as to produce a function $\psi(X)$ that ‘acts’ on an observable $\phi$ so as to yield $\int \bar{\mu}(dX)\,\psi(X)\,\phi(X)$. Recall further that $\bar{\mu}$ represents the SRB state for the unperturbed flow $\dot{X} = F(X)$, while $R_\sigma$ stands for the response to the perturbation $\delta_t F(X)$ over $F(X)$.
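
The integration by parts invoked for the second line of (9-162) can be sketched as follows. This is only an illustrative check under two assumptions that are not spelled out in the text above: the equilibrium measure is taken to be of the form $\bar{\mu}(dY) = dY$ (as for the microcanonical distribution), and the boundary term is assumed to vanish.

$$\begin{aligned}\int dY\, \mathcal{X}(Y)\cdot\nabla_Y(\phi\circ U_\sigma)(Y) &= \int dY\, \nabla_Y\cdot\big[\mathcal{X}(Y)\,\phi(U_\sigma Y)\big] - \int dY\, \big(\nabla_Y\cdot\mathcal{X}(Y)\big)\,\phi(U_\sigma Y)\\ &= -\int dY\, \big(\nabla_Y\cdot\mathcal{X}(Y)\big)\,\phi(U_\sigma Y).\end{aligned}$$

The first term on the right of the first line is a total divergence and drops out under the stated assumption; for a singular SRB measure no such step is available, which is why the manipulation is restricted to the equilibrium case.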


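We recall the notation that the first order perturbation $\delta_t\bar{\mu}$ over $\bar{\mu}$ resulting from $\mathcal{X}_t = \delta_t F$ is given by

$$\delta_t\bar{\mu} = \int_0^\infty d\sigma\, R_\sigma\mathcal{X}_{t-\sigma}. \tag{9-163}$$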


Recall further that $\bar{\mu} + \delta_t\bar{\mu}$ represents the SRB state for the flow $\dot{X} = F(X) + \delta_t F(X)$ where, on the right hand side, $t$ is assumed to be held frozen at some chosen value. This can be interpreted as the instantaneous SRB state of the perturbed system provided that the perturbation is sufficiently slow.


In the case of a time independent perturbation $\delta_t F(X) = \delta F(X)$, one obtains the time independent perturbation $\delta\bar{\mu}$ as

$$\delta\bar{\mu}(\phi) = -\int_0^\infty d\sigma \int \bar{\mu}(dX)\, \big(\nabla\cdot\delta F(X)\big)\,\phi(U_\sigma X), \tag{9-164}$$

(refer back to formula (9-161) for comparison) which holds when $\bar{\mu}(dX)$ has a density with respect to $dX$, as in the case of the microcanonical distribution for a Hamiltonian system.

The response in the frequency domain is described by the dynamic susceptibility $\chi_\omega$, as outlined in sec. 9.13.2, in terms of which one can formulate the fluctuation-dissipation theorem resulting from the SRB-based approach.


We next turn to the entropy production in the non-equilibrium process driven by the perturbation $\mathcal{X}_t = \delta_t F$ close to an equilibrium state. Recall from sec. 8.5 that the entropy production attains a minimum value in a stationary state near equilibrium. The equilibrium being itself a particular instance of a stationary state, one concludes that the entropy production in a time dependent state close to the equilibrium has to be of the second degree of smallness when considered in terms of the perturbation driving the time evolution of the state. This conclusion is borne out in the SRB-based approach under consideration (along with the additional assumption of reversibility), since the expression for entropy production turns out to be [125]
Or =) ds f WdX)V xX X)Vx&(UX) (9-165) 
0 


where $\nabla_X\cdot$ represents the divergence operator in the phase space and, in the present context, $\bar{\mu}(dX) = dX$. In the particular case of a time-independent perturbation, the entropy production is given by the following simple expression, of the second order of smallness in the perturbation $\mathcal{X} = \delta F$


1 [o-e) 
o= >| ds [ dXVxX(X)VxU,X), (9-166) 
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in which the integrand is the time autocorrelation function of the phase 


space contraction rate corresponding to the perturbation $\mathcal{X}$.


The derivation of the formulae (9-165) and (9-166) makes use of the fact that, in the case of an unperturbed motion determined by a Hamiltonian, $\nabla\cdot F(X) = 0$, i.e., the phase space contraction occurs only in virtue of the perturbation. It is the phase space contraction that is at the root of entropy production. What is more, in the equilibrium state, $\langle\nabla\cdot\mathcal{X}\rangle = 0$, which holds for the microcanonical ensemble provided $\mathcal{X}$ is sufficiently well-behaved.


With this background in place, one can go ahead and establish ([125], [51]) the fluctuation-dissipation theorem and the Onsager reciprocity relations, the two cornerstones of linear response theory, in the SRB-based approach expounded above, making use of the fact that $\langle\nabla\cdot\mathcal{X}\rangle = 0$, which can be assumed to be true for the equilibrium state (as can be seen by performing an integration by parts). The results provide support to those obtained in [45] in the case of a number of concrete systems with coupled affinities and fluxes.


9.14 Transient fluctuation relations 


9.14.1 Fluctuation relations: background 


The transient fluctuation relations (‘fluctuation relations’ in brief) constitute a novel trend in non-equilibrium statistical mechanics, throwing light upon questions relating to irreversibility, dissipation, and relaxation, and opening up broad areas of exploration in theory and applications. The validity of these relations is not confined to the domain of non-equilibrium steady states, and the theory resulting from these is distinct in this respect from the ‘SRB-based approach’ outlined in sections 9.10, 9.12, and 9.13.


This trend in non-equilibrium statistical mechanics was initiated in the 
early work of Evans, Cohen, and Morris, among others, and was given 
a definitive orientation by the formulation of the Evans-Searles fluctua- 
tion theorem (see [127] for reference to relevant literature). Among the 
numerous fluctuation theorems (FT’s) that followed (differing from one 
another in specific respects), those associated with the names of Crooks 
and Jarzynski deserve special mention. In the following, we will briefly 
outline the idea underlying these FT’s which, along with the Gallavotti- 
Cohen FT, constitute a major development in non-equilibrium statistical 


mechanics in recent decades. 


We will, in this section, focus on the Evans-Searles fluctuation theorem 
and briefly indicate the contents of the Crooks and Jarzynski fluctuation 
relations, all of these being of major relevance in present day explorations 
in non-equilibrium thermodynamics and its applications to nanosystems 


and to processes involving assemblies of biological macromolecules. 


As in all other areas of statistical mechanics, we distinguish between 
‘microscopic’ and ‘macroscopic’ features of a system, where the former 
relate to the equations of motion of its constituent particles and the latter 
to expectation values of a class of observables, as determined by proba- 


bility distributions over the microscopic states in the phase space. How- 
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ever, the thermodynamic limit is not referred to and the theory applies to 


‘small’ systems such as ones with N ~ 10° number of particles. 


A chaotic dynamical system such as the baker’s map may have a ‘phase 
space’ of a small number of dimensions and yet the phase space distribu- 
tion provides a description of the system that is complementary to that in 
terms of the flow equations or the relevant mapping. In other words, the 
complementarity between the two descriptions is not necessarily a conse- 
quence of a large number of degrees of freedom. Dynamical chaos, however, 
is not an essential pre-requisite of the fluctuation theorems to be introduced 


in the present section. 


Of fundamental relevance are (a) the feature of microscopic reversibility 
and (b) the initial phase space distribution from which the macroscopic 


process (one to which the FT is supposed to apply) is initiated. 


As a point of historical interest, it may be noted that Boltzmann put great 
emphasis on the initial condition in a macroscopic process in the explana- 
tion of irreversibility. In order that macroscopic irreversibility may emerge 
out of microscopic reversibility, the initial condition is to be, in a certain 


sense, a ‘typical’ one. 


We begin by highlighting three components of the equations of motion describing the microscopic dynamics. The basic ingredient in determining the equations of motion is the Hamiltonian (commonly, a time-independent one) $H(X, \lambda)$ where $X$ stands for the collection of the phase space variables, namely, the position vectors $\mathbf{r}_i$ ($i = 1, 2, \ldots, N$) of the particles of the system (collectively denoted by $Q$ in earlier chapters of the book) as also the momentum vectors $\mathbf{p}_i$ (collectively denoted by $P$). In addition, the Hamiltonian may depend on one or more parameters $\lambda$ (we consider only one parameter here for the sake of simplicity) that can be controlled externally so as to perform work on the system (for instance, the position of the piston in a cylinder containing a gas, which can be manipulated externally to perform work on the gas). We write the Hamiltonian in the form

$$H(X, \lambda) = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m} + V(\{\mathbf{r}_i\}, \lambda(t)), \tag{9-167}$$


in a notation that is by now familiar, where the potential energy function $V$, which is assumed to depend on the work parameter $\lambda(t)$, is left unspecified. Starting with the system in some equilibrium configuration, one may vary $\lambda$ in any arbitrary manner ($\lambda = \lambda(t)$) so as to drive it out of equilibrium.


The second set of ingredients in the equations of motion is made up of the external dissipative fields ($\mathbf{F}^{(\mathrm{ext})}$ in (9-107b)) that set up non-equilibrium transport processes in the system and perform work on it, and drive it out of equilibrium without, however, altering the underlying equilibrium state for any specified value of $\lambda$.


Finally, we assume that the system exchanges heat with one or more thermostats where, for the sake of simplicity and concreteness, we assume the latter to be of the ‘mechanical’ type, modifying the equations of motion in a deterministic manner, as explained in sec. 9.9.


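The resulting equations of motion, written for the sake of concreteness and easy reference, are of the form

$$\begin{aligned}\dot{\mathbf{r}}_i &= \frac{\mathbf{p}_i}{m} + \sum_{j=1}^{N} C_{ij}\,\mathbf{F}_j^{(\mathrm{ext})}(t),\\ \dot{\mathbf{p}}_i &= -\frac{\partial V}{\partial \mathbf{r}_i} + \sum_{j=1}^{N} D_{ij}\,\mathbf{F}_j^{(\mathrm{ext})}(t) - G_i\,\alpha(X)\,\mathbf{p}_i \qquad (i = 1, 2, \ldots, N).\end{aligned} \tag{9-168}$$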

These equations are written in a form more general than (9-107a) in several respects. The first term on the right hand side in the second line includes the force arising due to the variation of the work parameter ($\lambda(t)$). The external dissipative forces $\mathbf{F}_i^{(\mathrm{ext})}$ ($i = 1, 2, \ldots, N$), which may be time dependent, are assumed to couple to the particles of the system through the matrices of constants $C_{ij}$, $D_{ij}$ ($i, j = 1, 2, \ldots, N$) (for instance, there may be several species of particles coupling differently to the fields) and, finally, the constants $G_j$ ($j = 1, 2, \ldots, N$) indicate that the various degrees of freedom of the system may be coupled differently to the thermostat. The dissipative current densities $\mathbf{j}_i$ ($i = 1, 2, \ldots, N$) through which the external fields perform work on the system (of volume $\mathcal{V}$) are defined by
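$$\sum_{i=1}^{N} \mathbf{j}_i\cdot\mathbf{F}_i^{(\mathrm{ext})} = -\frac{1}{\mathcal{V}}\sum_{i,j=1}^{N}\left[\frac{\partial H}{\partial \mathbf{r}_i}\cdot C_{ij}\,\mathbf{F}_j^{(\mathrm{ext})} + \frac{\partial H}{\partial \mathbf{p}_i}\cdot D_{ij}\,\mathbf{F}_j^{(\mathrm{ext})}\right]. \tag{9-169}$$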


The rate of work done on the system by means of the parameter $\lambda$ and the dissipative forces taken together is given by

$$\frac{dW}{dt}(X, t) = \frac{d\lambda}{dt}\,\frac{\partial V(Q, \lambda)}{\partial \lambda} - \mathcal{V}\sum_{i=1}^{N} \mathbf{j}_i\cdot\mathbf{F}_i^{(\mathrm{ext})}(t). \tag{9-170a}$$


According to the first law of thermodynamics, the rate of heat transfer to the system from the thermostat(s) is then given by

$$\dot{q} = \dot{H} - \dot{W} = -\alpha(X)\sum_{i=1}^{N} G_i\,\frac{\mathbf{p}_i^2}{m}, \tag{9-170b}$$

where $\alpha(X)$ gets determined by the constraint imposed on the system by the thermostat (e.g., the isokinetic or the isoenergetic constraint, refer back to sec. 9.9.3).


The equations of motion (9-168) define a mapping $(X_0, 0) \to (X_t, t)$ in the phase space where an explicit time dependence is, in general, introduced in virtue of the possible time-dependence of $\lambda(t)$ and $\mathbf{F}_i^{(\mathrm{ext})}$ ($i = 1, 2, \ldots, N$). A basic assumption of the theory is that there exists a reversing transformation $R$ ($X \to X^* = RX$) under which the equations of motion remain invariant provided that one makes the transformation ($t \to -t$), along with additional transformations of $V(\{\mathbf{r}_i\}, \lambda(t))$, $C_{ij}(X)$, $D_{ij}(X)$, $\mathbf{F}_i(t)$ ($i = 1, 2, \ldots, N$) which we do not state explicitly (see [127]). The reversing transformation is assumed to satisfy $R^2 = I$, and its existence implies that, for every trajectory $(X_s, s)$ ($0 < s < t$) (where the representative point is located at $X_s$ at time $s$) starting from $(X_0, 0)$ and continuing up to $(X_t, t)$, there exists a corresponding time reversed trajectory satisfying the equations of motion such that the representative point is located at $RX_{t-s}$ at time $s$ ($0 < s < t$).

The above criterion of microscopic reversibility is assumed to apply to the overall equations of motion (9-168), which implies that, if a bundle of trajectories is initiated from a volume element $d^{[D]}X_0$ at time $s = 0$ and terminates in a volume element $d^{[D]}X_t$ at time $s = t$, then the reversed trajectories start from a volume element $d^{[D]}X_t^* = d^{[D]}X_t$ ($X_t^* = RX_t$) at $s = 0$ and terminate in $d^{[D]}X_0^* = d^{[D]}X_0$ at time $s = t$. A bundle of forward trajectories and the corresponding reversed trajectories are shown in fig. 9-11.
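
The pairing of a forward trajectory with its time-reversed partner can be illustrated numerically. The following is a minimal sketch, not the model of the text: it assumes a single particle in one dimension with a hypothetical anharmonic potential, no external fields and no thermostat, and it takes the reversing transformation to be the momentum flip $(x, p) \to (x, -p)$. The velocity-Verlet integrator is chosen because it is itself time-reversible.

```python
def force(x, k=1.0, g=0.1):
    # Hypothetical potential V(x) = k x^2 / 2 + g x^4 / 4 (an assumption for illustration).
    return -k * x - g * x ** 3


def verlet(x, p, dt, n_steps, m=1.0):
    """Velocity-Verlet integration of dx/dt = p/m, dp/dt = force(x)."""
    for _ in range(n_steps):
        p_half = p + 0.5 * dt * force(x)
        x = x + dt * p_half / m
        p = p_half + 0.5 * dt * force(x)
    return x, p


def reverse(x, p):
    """Reversing transformation R: flip the momentum; applying R twice gives the identity."""
    return x, -p


x0, p0 = 0.7, -0.3
xt, pt = verlet(x0, p0, dt=0.01, n_steps=1000)            # forward: X_0 -> X_t
xr, pr = verlet(*reverse(xt, pt), dt=0.01, n_steps=1000)  # reversed run, started at R X_t
x_back, p_back = reverse(xr, pr)                          # apply R once more
roundtrip_error = max(abs(x_back - x0), abs(p_back - p0))
print("round-trip error:", roundtrip_error)
```

The reversed run started at $RX_t$ ends, up to rounding error, at $RX_0$, which is the statement that the representative point of the reversed trajectory sits at $RX_{t-s}$ at time $s$.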


Figure 9-11: Depicting a bundle of forward trajectories and the corresponding time-reversed ones; the former are initiated at time $s = 0$ in a small phase space volume element $d^{[D]}X_0$ around the point $X_0$ and, after a time $s = t$, arrive at points around $X_t$, spanning a volume element $d^{[D]}X_t$; the reversed trajectories start from a volume element $d^{[D]}X_t^*$ around $X_t^*$ at time $s = 0$, and are terminated at time $s = t$ in a volume element $d^{[D]}X_0^*$ around $X_0^*$; here $X^*$ denotes the phase space point resulting from $X$ by the application of the reversing operator $R$; the requirement of microscopic reversibility implies that $d^{[D]}X_t^* = d^{[D]}X_t$ and, likewise, $d^{[D]}X_0^* = d^{[D]}X_0$.


The notation $d^{[D]}X$ denotes a small volume element around the point $X$ in the phase space, where $D$ stands for the dimension of the latter ($= 6N$ for a system made up of $N$ particles).




If the initial ($s = 0$) probability density of the forward trajectories at $X_0$ be denoted by $\rho(X_0, 0)$, and that characterizing the reversed trajectories at $X_t^*$ at time $s = 0$ be $\rho(X_t^*, 0)$, then the condition for macroscopic reversibility, which is the same thing as the conservation of probability under the switching from forward to reversed trajectories, can be expressed in differential form as

$$\rho(X_0, 0) = J(X_0, X_t^*)\,\rho(X_t^*, 0), \tag{9-171a}$$


where $J(X_0, X_t^*)$ stands for the absolute value of the Jacobian determinant of the transformation $X_0 \to X_t^*$, both taken at $s = 0$. However, by the definition of the reversing transformation, the phase space volume element $d^{[D]}X_t$ at $s = t$ is the same as the volume element $d^{[D]}X_t^*$ at $s = 0$. In other words, the ratio $\rho(X_0, 0)/\rho(X_t^*, 0)$ equals the expansion of the phase space volume element from $X_0$ at $s = 0$ to $X_t$ at $s = t$. The latter can be expressed in terms of the phase space contraction rate $\chi(X)$ (refer back to first equality in the second line of (9-112c)) along the trajectory from $X = X_0$ at $s = 0$ to $X = X_t$ at $s = t$:

$$\chi(X_s, s) = -\Lambda(X_s, s) = -\sum_{i=1}^{N}\left[\frac{\partial}{\partial \mathbf{r}_i}\cdot\dot{\mathbf{r}}_i + \frac{\partial}{\partial \mathbf{p}_i}\cdot\dot{\mathbf{p}}_i\right]_{X = X_s,\, s}, \tag{9-171b}$$


where the time is mentioned explicitly because the equations of motion depend, in general, explicitly on time. In this expression, $\Lambda(X_s, s)$ denotes the phase space expansion rate along the trajectory in question. Referring to the phase space expansion ratio in a finite time $t$, i.e., the absolute value of the Jacobian determinant of the transformation from $(X_0, 0)$ to $(X_t, t)$, one has

$$J(X_0, s = 0;\, X_t, s = t) = \exp\left(\int_0^t \Lambda(X_s, s)\,ds\right). \tag{9-171c}$$
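
Formula (9-171c) can be checked numerically on a toy flow. The sketch below is an illustration under stated assumptions and is not the system (9-168): it uses a hypothetical one-particle flow $\dot{x} = p$, $\dot{p} = -x - \alpha x^2 p$ with a position-dependent friction, for which the phase space expansion rate is $\Lambda = -\alpha x^2$. The Jacobian determinant of the flow map is estimated by central finite differences and compared with the exponential of the time integral of $\Lambda$ along the trajectory.

```python
import math

ALPHA = 0.5  # hypothetical friction strength (an assumption for illustration)


def rhs(state):
    """Right hand side for (x, p, I), where I accumulates the integral of Lambda."""
    x, p, _ = state
    lam = -ALPHA * x * x  # Lambda = d(xdot)/dx + d(pdot)/dp = -ALPHA * x^2
    return (p, -x - ALPHA * x * x * p, lam)


def rk4_step(state, dt):
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def flow(x0, p0, t=2.0, dt=1e-3):
    """Return (x_t, p_t, integral of Lambda from 0 to t) for the trajectory from (x0, p0)."""
    state = (x0, p0, 0.0)
    for _ in range(round(t / dt)):
        state = rk4_step(state, dt)
    return state


def jacobian_det(x0, p0, eps=1e-5):
    """Finite-difference Jacobian determinant of the flow map (x0, p0) -> (x_t, p_t)."""
    xa, pa, _ = flow(x0 + eps, p0)
    xb, pb, _ = flow(x0 - eps, p0)
    xc, pc, _ = flow(x0, p0 + eps)
    xd, pd, _ = flow(x0, p0 - eps)
    dxdx, dpdx = (xa - xb) / (2 * eps), (pa - pb) / (2 * eps)
    dxdp, dpdp = (xc - xd) / (2 * eps), (pc - pd) / (2 * eps)
    return dxdx * dpdp - dxdp * dpdx


det_fd = jacobian_det(0.8, 0.3)
_, _, lam_integral = flow(0.8, 0.3)
det_formula = math.exp(lam_integral)
print(det_fd, det_formula)
```

The two numbers agree to within the finite-difference error, and both are smaller than one since the friction term makes the flow contract phase space volume.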


In the light of all this, the condition for macroscopic reversibility can be stated in the form

[condition for macroscopic reversibility] :

$$\frac{\rho(X_0, 0)}{\rho(X_t^*, 0)} = \exp\left(\int_0^t \Lambda(X_s, s)\,ds\right), \tag{9-171d}$$

(check this out), where it should hold for all trajectories allowed by the initial probability distribution $\rho(X_0, 0)$ and for all values of $t$, the time elapsed along any arbitrarily chosen trajectory. Evidently, a necessary condition for macroscopic reversibility is that, for any trajectory allowed by $\rho(X_0, 0)$, $\rho(X_t^*, 0)$ should be non-zero for all possible $t$ ($> 0$). This is referred to as the condition for the system to be ergodically consistent.


In the absence of the thermostat, the system becomes adiabatically isolated, in which case its phase space contraction rate reduces to zero. In other words, the external fields $\mathbf{F}_i^{(\mathrm{ext})}$ ($i = 1, 2, \ldots, N$) are required to satisfy the condition


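$$\sum_{i,j=1}^{N}\left[\frac{\partial}{\partial \mathbf{r}_i}\cdot\big(C_{ij}(X)\,\mathbf{F}_j^{(\mathrm{ext})}\big) + \frac{\partial}{\partial \mathbf{p}_i}\cdot\big(D_{ij}(X)\,\mathbf{F}_j^{(\mathrm{ext})}\big)\right] = 0. \tag{9-172}$$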


9.14.2 Evans-Searles FT: the dissipation function 


The Evans-Searles fluctuation theorem is formulated in terms of the probability distribution of various possible values of the dissipation function ($\Omega_t(X_0)$) for any arbitrarily chosen value of $t$, where the dissipation function is defined with reference to the condition (9-171d) so as to provide a quantitative measure of macroscopic irreversibility. It measures the excess of the left hand side of that equation over the right hand side, both being expressed in the logarithmic scale. In other words, we define

$$\Omega_t(X_0) = \ln\frac{\rho(X_0, 0)}{\rho(X_t^*, 0)} - \int_0^t \Lambda(X_s, s)\,ds. \tag{9-173}$$


We recall the notation, where $(X_s, s)$ denotes the location of the evolving phase point (following the equations of motion, eq. (9-168)) at time $s$ when initiated at $X_0$ at time 0, while $X_t^*$ denotes the phase point obtained from $X_t$ under the action of the reversing transformation $R$ ($R$ does not depend explicitly on $t$). We now focus on all phase points $X_0$ for which the value of the dissipation function, with $t$ ($> 0$) fixed arbitrarily, lies in a small range, say, from $\gamma - d\gamma$ to $\gamma + d\gamma$, and define the probability density $p(\Omega_t = \gamma)$ as

$$p(\Omega_t = \gamma) = \int d^{[D]}X_0\,\delta\big(\Omega_t(X_0) - \gamma\big)\,\rho(X_0, 0). \tag{9-174a}$$


One can write a corresponding relation for the probability density for $\Omega_t = -\gamma$ where, however, we make a slight alteration in notation by observing that, in the above expression for $p(\Omega_t = \gamma)$, $X_0$ is nothing but a dummy variable of integration that can be replaced with any other variable. Thus, we write

$$p(\Omega_t = -\gamma) = \int d^{[D]}X_t^*\,\delta\big(\Omega_t(X_t^*) + \gamma\big)\,\rho(X_t^*, 0), \tag{9-174b}$$

where we now interpret $X_t^*$ to be related to $X_0$ by the operation of inversion. At the same time, we note from (9-173) that microscopic reversibility implies

$$\Omega_t(X_t^*) = -\Omega_t(X_0), \tag{9-175a}$$

(check this out) and, moreover, by the defining formula (9-173),

$$\rho(X_t^*, 0)\,d^{[D]}X_t^* = \rho(X_0, 0)\,d^{[D]}X_0\,e^{-\Omega_t(X_0)}. \tag{9-175b}$$

One thereby obtains, by making use of the delta function,

$$p(\Omega_t = -\gamma) = e^{-\gamma}\int d^{[D]}X_0\,\delta\big(\Omega_t(X_0) - \gamma\big)\,\rho(X_0, 0), \tag{9-176}$$

which, finally, implies the Evans-Searles fluctuation relation

$$\frac{p(\Omega_t = \gamma)}{p(\Omega_t = -\gamma)} = e^{\gamma}. \tag{9-177}$$

It may be noted that (9-176) is the more general form of the FT since (9-177) applies only if $p(\Omega_t = \gamma) \neq 0$.
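
The passage from the definition (9-173) to (9-175b) and (9-176) can be spelled out as follows; this is a sketch of the intermediate steps, using only (9-171c), (9-173), (9-175a), and the equality of the volume elements $d^{[D]}X_t^*$ and $d^{[D]}X_t$ stated earlier.

$$\begin{aligned}\rho(X_t^*, 0) &= \rho(X_0, 0)\,e^{-\Omega_t(X_0)}\exp\left(-\int_0^t \Lambda(X_s, s)\,ds\right) &&\text{(from (9-173))},\\ d^{[D]}X_t^* = d^{[D]}X_t &= \exp\left(\int_0^t \Lambda(X_s, s)\,ds\right) d^{[D]}X_0 &&\text{(from (9-171c))},\\ \delta\big(\Omega_t(X_t^*) + \gamma\big) &= \delta\big(\Omega_t(X_0) - \gamma\big) &&\text{(from (9-175a))}.\end{aligned}$$

Multiplying the first two lines, the exponentials of the integral of $\Lambda$ cancel, which gives (9-175b). Substituting in (9-174b), the delta function fixes $\Omega_t(X_0) = \gamma$, so that the factor $e^{-\Omega_t(X_0)}$ can be taken outside the integral as $e^{-\gamma}$.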


We use the terms ‘fluctuation theorem’ and ‘fluctuation relation’ synony- 


mously. 


The Evans-Searles FT compares the dissipation functions along trajectories of an arbitrarily chosen duration $t$ originating from a given probability distribution in the phase space, and arrives at (9-177) for the ratio of probability densities for the dissipation function having values $\gamma$ and $-\gamma$, for any real value of $\gamma$. It tells us that positive values of $\gamma$ are exponentially more probable (for large $|\gamma|$) than negative values which, in turn, implies that $\langle\Omega_t\rangle$ is positive (reason out why), where the averaging is with respect to the initial probability distribution $\rho(X, 0)$ in the phase space. The only requirement to be satisfied by this probability distribution is that of ergodic consistency, i.e., if $\rho$ allows a trajectory originating at $X_0$ at time 0 (i.e., if $\rho(X_0, 0) \neq 0$) then, for any given $t > 0$, a trajectory originating at $X_t^* = RX_t$ should also be allowed (i.e., we must have $\rho(X_t^*, 0) \neq 0$). Clearly, an equilibrium distribution such as the microcanonical or a canonical one constitutes a special instance that meets this requirement.
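
The step marked ‘reason out why’ can be sketched as follows, assuming only that the probability density of the dissipation function is normalized and that the integrals converge; (9-176) is used to eliminate the density at $-\gamma$:

$$\langle\Omega_t\rangle = \int_{-\infty}^{\infty}\gamma\,p(\Omega_t = \gamma)\,d\gamma = \int_0^{\infty}\gamma\,\big[p(\Omega_t = \gamma) - p(\Omega_t = -\gamma)\big]\,d\gamma = \int_0^{\infty}\gamma\,\big(1 - e^{-\gamma}\big)\,p(\Omega_t = \gamma)\,d\gamma \;\geq\; 0.$$

The integrand in the last expression is non-negative for every $\gamma > 0$, and the average vanishes only if the distribution is concentrated at $\gamma = 0$.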


The dissipation function $\Omega_t(X)$ is, in a sense, an indicator of macroscopic irreversibility since positive values of this function for any finite stretch of a trajectory in the phase space are more probable than negative values (for arbitrarily chosen $X$), and the mismatch of probabilities increases exponentially for large $|\Omega_t|$. In other words, the positive definiteness of $\langle\Omega_t\rangle$ can be looked upon as a statement explicating the second law of thermodynamics.
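
A simple way to get a feel for the content of the fluctuation relation is to test it on a model distribution. The sketch below assumes, purely for illustration, that the dissipation function is Gaussian distributed with variance equal to twice its mean; this choice is not derived from the dynamics discussed above, but it is one for which the logarithm of the ratio of densities at $\gamma$ and $-\gamma$ equals $\gamma$ exactly. The sampled averages illustrate that negative values do occur while the mean stays positive, and that the average of $e^{-\Omega_t}$ is close to one, as follows on integrating (9-176) over $\gamma$.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_ratio(gamma, mean):
    """ln[p(gamma)/p(-gamma)] for a Gaussian with variance 2*mean (assumed model)."""
    var = 2.0 * mean
    return math.log(gaussian_pdf(gamma, mean, var) / gaussian_pdf(-gamma, mean, var))


mean = 1.0  # hypothetical mean value of the dissipation function
ratios = {g: log_ratio(g, mean) for g in (0.5, 1.0, 2.0)}

rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [rng.gauss(mean, math.sqrt(2.0 * mean)) for _ in range(200_000)]
avg_omega = sum(samples) / len(samples)
avg_exp = sum(math.exp(-s) for s in samples) / len(samples)
frac_negative = sum(s < 0 for s in samples) / len(samples)

print(ratios)
print(avg_omega, avg_exp, frac_negative)
```

In this model a sizeable fraction of the samples is negative, so trajectories with negative dissipation are by no means forbidden; they are only exponentially less probable than their positive counterparts.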


While we have derived the Evans-Searles FT by referring to a determin- 
istic system, one can also arrive at the same fluctuation relation for one 
satisfying a set of stochastic equations of motion such as the Langevin 


equations. 


We also observe that the Evans-Searles fluctuation relation can be formulated by referring to a non-equilibrium steady state (which requires the external forces $\mathbf{F}^{(\mathrm{ext})}$ to be time-independent) as was done in the case of the Gallavotti-Cohen FT (sec. 9.12). However, the phase space probability distribution describing such a steady state is known to be a singular one (a SRB distribution), and one requires an approximation by assuming that the steady state distribution is obtained starting from an equilibrium distribution by switching on the external forces at time, say, $t = 0$ and ignoring the contribution of the transient state in the determination of the dissipation function (see [127]). With such an approximation in place, one obtains the following steady state version of the Evans-Searles FT, which is of an asymptotic nature analogous to the Gallavotti-Cohen FT (see paragraph following (9-145))


y= (9-178) 


where $\bar{\Omega}_t$ stands for the time average of the dissipation function over an interval $t$, and $p(\bar{\Omega}_t)$ denotes the steady state probability distribution of $\bar{\Omega}_t$.


The Evans-Searles fluctuation relation is consistent with the Green-Kubo 


relations in linear response theory [127]. 


9.14.3 Crooks fluctuation relation and Jarzynski fluctu- 


ation relation 


In contrast to the Evans-Searles FT, the Crooks FT compares the proba- 
bility distributions of the work done on the system for two sets of trajec- 


tories originating in two different equilibrium distributions. 


Let $W(X_0, t)$ denote the work done on the system along a trajectory of duration $t$ initiated at $X_0$ (refer to (9-170a) for the rate of work done, i.e., for $\dot{W} = \frac{dW}{dt}$), and let the system be initiated from an equilibrium state corresponding to a value $\lambda = A$ of the work-parameter, described by the phase space probability distribution $\rho_{\mathrm{eq}}(X_0, \lambda = A)$ corresponding to a canonical ensemble (such an equilibrium state is defined when the external forces $\mathbf{F}_i^{(\mathrm{ext})}$ ($i = 1, 2, \ldots, N$) are all zero and the thermostats are switched off; in contrast, the trajectory from $s = 0$ to $s = t$ is described under the joint action of the time-varying work parameter, the external forces, and the thermostats). The distribution of possible values ($w$) of $W(X_0, t)$ is given by the probability density

$$p_f(W_f = w) = \int d^{[D]}X_0\,\delta\big(W(X_0, t) - w\big)\,\rho_{\mathrm{eq}}(X_0, \lambda = A). \tag{9-179}$$


We consider the ensemble of trajectories initiated from the equilibrium configuration corresponding to $\lambda = A$, evolving from time $s = 0$ to $s = t$ (refer to equations of motion (9-168), where the potential energy depends on the time-dependent work parameter $\lambda(t)$) when the work parameter acquires a value $\lambda = B$ (say), though it is not necessary that the system arrives at an equilibrium state at time $s = t$. The sub-index ‘f’ on the left hand side of (9-179) refers to this ensemble of ‘forward trajectories’.
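
The forward work distribution of (9-179) can be estimated by sampling in a simple special case. The following sketch is built on assumptions chosen for illustration and not taken from the text: a single one-dimensional harmonic oscillator whose stiffness plays the role of the work parameter, switched suddenly from a hypothetical value `K_A` to `K_B`, with no external dissipative forces and no thermostat acting during the switch. For such a sudden switch the work along a trajectory is simply the change in potential energy at the initial position, so that the canonical sampling of the initial position is all that is needed.

```python
import random

BETA, K_A, K_B = 1.0, 1.0, 2.0  # hypothetical inverse temperature and stiffness values


def sample_work(n, seed=0):
    """Sample W = (K_B - K_A) x0^2 / 2 with x0 drawn from the canonical ensemble at K_A."""
    rng = random.Random(seed)
    sigma_x = (BETA * K_A) ** -0.5  # canonical width of the position distribution
    works = []
    for _ in range(n):
        x0 = rng.gauss(0.0, sigma_x)  # the momentum does not enter the work for a sudden switch
        works.append(0.5 * (K_B - K_A) * x0 * x0)
    return works


def histogram_density(values, lo, hi, n_bins):
    """Normalized histogram: an estimate of the probability density on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]


works = sample_work(100_000)
p_f = histogram_density(works, 0.0, 4.0, 40)  # estimate of p_f(W_f = w) on a grid of w values
mean_work = sum(works) / len(works)
print(mean_work, p_f[:5])
```

For these parameter values the exact mean work is $(K_B - K_A)/(2\beta K_A) = 0.5$, which the sampled mean reproduces to within statistical error; the histogram plays the role of the delta-function average in (9-179).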


This set of forward trajectories is now compared with the corresponding set of ‘reversed trajectories’ that are assumed to be initiated from a second equilibrium distribution $\rho_{\mathrm{eq}}(X, \lambda = B)$ where the two equilibrium distributions ($\rho_{\mathrm{eq}}(X, \lambda = A)$ and $\rho_{\mathrm{eq}}(X, \lambda = B)$) are assumed to correspond to the same value of the inverse temperature $\beta$. Let the probability density for the value $-w$ of the work performed on the system during a time $t$ along the reversed trajectories be denoted by $p_r(W_r = -w)$ where, corresponding to the value $W_f = w$ for the forward trajectories, we fix upon the value $W_r = -w$ for the reversed trajectories. Observing once again that the integration variable $X_0$ in (9-179) can be replaced with any other variable which we choose to be $X_t^*$ in the present context, we can write

$$p_r(W_r = -w) = \int d^{[D]}X_t^*\,\delta\big(W(X_t^*, t) + w\big)\,\rho_{\mathrm{eq}}(X_t^*, \lambda = B). \tag{9-180}$$


Referring to (9-170a), one can equate $\dot{W}(X, t)$ with $\dot{H}^{(\mathrm{adiabatic})}$, i.e., the rate of change of the Hamiltonian in the absence of the thermostat, which implies

$$W(X_0, t) = \int_0^t ds\,\dot{H}^{(\mathrm{adiabatic})}(X_s, \lambda(s)). \tag{9-181a}$$

The Hamiltonian being an even function under the time reversal operation $R$ (reason out why), the time integration implies

$$W(X_t^*, t) = -W(X_0, t). \tag{9-181b}$$


We now make use of the first law relation $\dot W (= \dot H^{(\rm adiabatic)}) = \dot H - \dot q$ (refer
to (9-170b)) where $\dot q$, the rate of heat supplied to the system, equals $k_BT$
times the entropy production rate (with a negative sign) and is given by

$$\dot q(X) = k_B T\, \Lambda(X). \tag{9-181c}$$

The heat given to the system equals $k_BT$ times the entropy increase of the
thermostat, with a negative sign, which is also $k_BT$ times the entropy
production in the system, again with a minus sign. Recalling that the entropy
production rate equals the rate of phase space contraction, i.e., $-\Lambda(X)$, one
obtains (9-181c).


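Making use of all this, one obtains

$$W(X_t^*, t) = -W(X_0, t) = -\int_0^t ds\, \big[\dot H(X_s, s) - \dot q(X_s)\big]
= H(X_0, \lambda = A) - H(X_t, \lambda = B) + k_B T \int_0^t ds\, \Lambda(X_s). \tag{9-181d}$$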


The probability distribution $\rho_{\rm eq}(X_t^*, \lambda = B)$ in (9-180) can now be
expressed in terms of $\rho_{\rm eq}(X_0, \lambda = A)$ (eq. (9-179)) as

$$\rho_{\rm eq}(X_t^*, \lambda = B) = \frac{Z_A}{Z_B}\, \rho_{\rm eq}(X_0, \lambda = A)\, e^{-\beta W(X_0, t)}\,
\frac{\delta V^{(\rm phase\text{-}space)}(X_0)}{\delta V^{(\rm phase\text{-}space)}(X_t^*)}, \tag{9-182}$$

where we have replaced $e^{-\int_0^t ds\, \Lambda(X_s)}$ with the ratio of phase space volume
elements $\delta V^{(\rm phase\text{-}space)}(X_0)$ and $\delta V^{(\rm phase\text{-}space)}(X_t)$, replacing the latter
with $\delta V^{(\rm phase\text{-}space)}(X_t^*)$ (by microscopic time reversal); in addition we
have used the fact that the Hamiltonian is even under time reversal
(i.e., $H(X_t, \lambda) = H(X_t^*, \lambda)$). Finally, $Z_A$, $Z_B$ on the right hand side stand
for the partition functions pertaining to the canonical distributions for
$\lambda = A$, $\lambda = B$.


We now substitute in (9-180), make use of (9-181b), and express the
partition functions $Z_A$, $Z_B$ in terms of the respective free energies, so as
to obtain

$$p_r(W_r = -w) = e^{-\beta(w - \Delta F)} \int dX_0\, \delta\big(W(X_0, t) - w\big)\, \rho_{\rm eq}(X_0, \lambda = A), \tag{9-183}$$

where $\Delta F = F_B - F_A$. Since the integral on the right hand side is nothing
but $p_f(W_f = w)$, we arrive at the Crooks fluctuation relation

$$\frac{p_f(W_f = w)}{p_r(W_r = -w)} = e^{\beta(w - \Delta F)}, \tag{9-184}$$

where the time of duration of the forward and reverse processes is not
mentioned since only the initial and final values of the work parameter
are relevant in defining the forward and reverse processes.


Rearranging terms in (9-184) and integrating over all possible values of
$w$, one obtains

$$\int dw\, e^{-\beta w}\, p_f(W_f = w) = e^{-\beta \Delta F} \int dw\, p_r(W_r = -w), \tag{9-185}$$

where the integral on the right hand side is unity. This gives the Jarzynski
fluctuation relation

$$\langle e^{-\beta W} \rangle = e^{-\beta \Delta F}, \tag{9-186}$$


which states that the average of $e^{-\beta W}$ taken over an ensemble of paths,
initiated from an equilibrium distribution corresponding to some value,
say, $\lambda = A$ of the work parameter, and continued up to $\lambda = B$ equals $e^{-\beta \Delta F}$,
where $\Delta F = F_B - F_A$ is the free energy difference between equilibrium
states for $\lambda = B$ and $\lambda = A$. It may be noted that the forward process
referred to in the expression on the left hand side need not terminate in
the equilibrium state corresponding to $\lambda = B$ (the expression on the right
hand side, however, does depend on the equilibrium state for $\lambda = B$),
and the rate at which the forward process takes place is immaterial since
what is relevant is only the pair of values $A$, $B$ of the work parameter.
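
The content of (9-186) can be illustrated numerically. The sketch below is not part of the derivation above: it assumes, purely for illustration, that the work values are Gaussian with mean `mean_work` and standard deviation `sigma` (hypothetical parameters), in which case the exponential average can be evaluated in closed form and gives $\Delta F = \langle W\rangle - \beta\sigma^2/2$. The function names are likewise illustrative.

```python
import math
import random


def jarzynski_free_energy(work_samples, beta):
    """Estimate Delta F = -(1/beta) * ln <exp(-beta W)> from sampled work values."""
    # Subtract the smallest work value before exponentiating (log-sum-exp trick)
    # so that the exponentials stay of order one.
    w_min = min(work_samples)
    acc = sum(math.exp(-beta * (w - w_min)) for w in work_samples)
    return w_min - math.log(acc / len(work_samples)) / beta


def demo(beta=1.0, mean_work=2.0, sigma=1.0, n_samples=200_000, seed=0):
    """Compare the sampled estimate with the closed form for Gaussian work."""
    rng = random.Random(seed)
    samples = [rng.gauss(mean_work, sigma) for _ in range(n_samples)]
    estimate = jarzynski_free_energy(samples, beta)
    exact = mean_work - 0.5 * beta * sigma ** 2  # Gaussian identity (assumption)
    average_work = sum(samples) / n_samples
    return estimate, exact, average_work


if __name__ == "__main__":
    est, exact, avg = demo()
    print("Delta F estimate:", est, " closed form:", exact, " <W>:", avg)
```

In this toy setting the estimated free energy difference lies below the average work, the gap being the average dissipated work $\beta\sigma^2/2$.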


The Jarzynski fluctuation relation is notable for the following reason: the
second law of thermodynamics implies that the work performed on a system
satisfies the inequality $W \geq \Delta F$, where the equality sign corresponds
to a quasi-static process between the initial and final states. Thus,
considering an ensemble of paths between the initial and final states, one
must have, on the average, $\langle W \rangle \geq \Delta F$. The Jarzynski relation tells us
that the fluctuations of the work done must necessarily be such that the
average of the exponential $e^{-\beta W}$ equals $e^{-\beta \Delta F}$. This is of great significance
in the experimental determination of free energies, especially for small
(though macroscopic) systems.
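
The connection between (9-186) and the second law inequality can be made explicit in one line. The following is a short sketch that uses only (9-186) and Jensen's inequality for the convex function $e^{-x}$; the Gaussian special case in the second line is an illustrative assumption (with $\sigma_W^2$ denoting the variance of the work) and is not implied by the derivation above.

```latex
\begin{align*}
e^{-\beta \Delta F} &= \langle e^{-\beta W} \rangle \;\geq\; e^{-\beta \langle W \rangle}
\quad\Longrightarrow\quad \langle W \rangle \geq \Delta F \qquad (\beta > 0), \\
\text{Gaussian } W:\quad \langle e^{-\beta W} \rangle &= e^{-\beta \langle W \rangle + \frac{1}{2}\beta^2 \sigma_W^2}
\quad\Longrightarrow\quad \langle W \rangle - \Delta F = \tfrac{1}{2}\beta \sigma_W^2 \geq 0 .
\end{align*}
```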


The fluctuation relations outlined in this book appear in various forms 
and under various conditions differing from one another in details, and 
hold for deterministic as well as stochastic systems, though the require- 
ment of reversibility is common to all the derivations. For more complete 
and precise expositions see, for instance [34], [22], [69] (the literature on 


fluctuation relations and related issues is quite voluminous). 
Chapter 10


Quantum Chaos and Foundations of Quantum Statistical Mechanics


The content of this chapter is principally based on [26], [27], [131], and [56]. 


10.1 Foundations of quantum statistical mechanics: introduction


Quantum statistical mechanics of a system in equilibrium is based on 
the equilibrium ensembles introduced in sec. 2.1. The question arises of 
justifying these ensembles by referring to the Hamiltonian of the system 
since the latter completely determines its microscopic dynamics. In ad- 
dressing this question, we recall the considerations in the classical theory 


where a similar question was raised and discussed. It was observed that 




the ergodic hypothesis (or, its more precise mathematical formulation, 
the ergodic theorem) is sufficient to justify the classical microcanonical 
ensemble, which provides the basis of classical statistical mechanics of 
equilibrium states, and also of non-equilibrium processes close to equilib- 
rium states. Moreover, the microcanonical and the canonical ensembles 
imply the maximum entropy principle and thus explain the irreversible 
evolution towards the equilibrium state in consonance with the second 
law provided one confines oneself to the said near-equilibrium regime, 
where the principle of local equilibrium holds. In other words, the classi- 
cal equilibrium ensembles, along with the principle of local equilibrium, 
explain the macroscopic irreversibility ordained by the second law within 


the confines of this regime. 


The question that remains (in addition to the one of a foundational
justification of the second law) is that of explaining the macroscopic
irreversibility (in contrast to microscopic reversibility) for states in the
far-from-equilibrium regime. As we have seen, the irreversibility is explained
in the context of non-equilibrium steady states in terms of the SRB
distribution, where the chaotic hypothesis appears to play a role analogous
to the ergodic hypothesis in equilibrium and near-equilibrium statistical
mechanics. The transient fluctuation relations (refer back to sec. 9.14), on the
other hand, reveal the role of the dissipation function in explaining the
irreversibility. Thus, pending further clarification of the SRB-based
approach and also of the transient fluctuation relations in terms of actual
physical processes occurring in the system of interest, these two do
provide at least a partial explanation of the irreversibility as stated in the
second law.


Turning now to the quantum theory of macroscopic systems, a sufficient
criterion for the justification of the microcanonical ensemble and for the
explanation of the irreversible evolution towards the equilibrium state
is provided by the eigenstate thermalization hypothesis (‘ETH’ in brief),
based on the random matrix theory (‘RMT’), as we briefly outline in the
present chapter.


Recall that there also exists an alternative route for the explanation 
of the irreversible evolution towards an equilibrium state, based on 
a ‘contracted’ description where the system under consideration is 
assumed to be coupled to a larger system (a ‘bath’) and the degrees 
of freedom of the latter are averaged out, resulting in a reduced sys- 
tem description by means of a (generalized) Langevin equation or, 
alternatively, by a master equation. In a more general approach, 
the projection operator technique may be adopted to project out a set 
of ‘irrelevant’ observables so as to describe the evolution of a set of 
‘relevant’ ones where, once again, one arrives at a set of Langevin 


equations involving random ‘forces’. 


In this approach based on RMT and ETH, there is no explicit involvement 
of random forces or of a mechanism based on contraction from a larger to 
a smaller system, and one refers to the deterministic evolution of a sin- 
gle system from an initial state that can be a pure one (though a mixed 
initial state is also admissible) where it turns out that the eventual
consequences of such deterministic evolution are equivalent to those resulting
from an equilibrium ensemble. The ETH, which is based on RMT, plays
a role analogous to the chaotic hypothesis since it explains (in the above 
sense of equivalence) the evolution towards the microcanonical ensemble 
for a closed system and is consistent with the second law of thermo- 
dynamics. As we indicate below, it is consistent with the results of the 
linear response theory and, more generally, with the transient fluctuation 


relations as well. 


10.2 Quantum chaos and the random matrix theory


10.2.1 Random matrix theory: introduction 


Quantum mechanics is distinct from the classical theory in that it does 
not admit of dynamical chaos in the same sense as the latter does (refer 
back to sec. 9.5). This difference between the two theories can be traced 
back to the linearity of the quantum mechanical evolution equations and 
to the uncertainty principle that puts a constraint on the description 
in terms of trajectories exploring arbitrarily small regions of the phase 


space. 


However, quantum and classical theories meet in the limit $\hbar \to 0$, and
there appear definite signatures of chaos in the quantum mechanical
description of systems whose classical counterparts are chaotic. Such
signatures retain a distinct identity even for values of $\hbar$ away from zero,
notably in the form of statistical features of energy eigenvalues and
eigenvectors. Recall that, in the classical theory, chaos implies a measure of
dynamical complexity, as revealed in a non-zero value of the KS entropy 
of a system. In the quantum mechanical description, one commonly en- 
counters systems whose energy spectra and energy eigenvectors cannot 
be specified in simple terms, so that these are effectively described only 
by referring to their statistical features. More precisely, the latter turn 
out to be the same as the statistics of eigenvalues and eigenvectors of 
random matrices. For instance, the energy levels of nuclei resulting from 
complex interactions among a large number of nucleons turn out to pos- 
sess the statistical features of the eigenvalues of ensembles of random 
matrices where the ensembles are constrained by means of the overall 


symmetry characterizing the nuclear interactions. 


10.2.2 The Wigner-Dyson distribution of level-spacings 


In the absence of special symmetries (which is commonly the case for
most systems with a large number of interacting particles), one
distinguishes systems that obey time reversal invariance from those that do
not, e.g., those involving magnetic fields. As a simple example we
consider, for the sake of concreteness, an ensemble of $2 \times 2$ real symmetric
matrices (invariant under time reversal) of the form [26]

$$\hat H = \begin{pmatrix} \varepsilon_1 & \dfrac{V}{\sqrt{2}} \\[2mm] \dfrac{V^*}{\sqrt{2}} & \varepsilon_2 \end{pmatrix}, \tag{10-1a}$$

with the parameters $\varepsilon_1, \varepsilon_2, V (= V^*)$ drawn independently from a Gaussian
ensemble of zero mean and variance $\sigma^2$. In this case, working out the
probability distribution for the separation ($\omega$) between the energy levels,
one obtains

$$P(\omega) = \frac{\omega}{2\sigma^2} \exp\left(-\frac{\omega^2}{4\sigma^2}\right). \tag{10-1b}$$


This distribution of the level separation is characterized by the feature
of ‘level repulsion’ ($P(\omega) \to 0$ at vanishingly small values of $\omega$) and that
of a Gaussian decay at large separations. In the case of an ensemble
with complex values of the parameter $V$ (broken time reversal invariance)
where its real and imaginary parts are again drawn independently from
the Gaussian ensemble referred to above, one obtains

$$P(\omega) = \frac{\omega^2}{2\sqrt{\pi}\, \sigma^3} \exp\left(-\frac{\omega^2}{4\sigma^2}\right). \tag{10-1c}$$
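
The level repulsion expressed by (10-1b) is easy to check by direct sampling. The sketch below is an illustration only: it assumes that $\varepsilon_1$, $\varepsilon_2$ and a real $V$ are independent Gaussians of zero mean and standard deviation `sigma`, and it uses the closed-form eigenvalue difference of a real symmetric $2 \times 2$ matrix, so that no linear algebra library is needed. The function names and the sample size are hypothetical choices.

```python
import math
import random


def goe_2x2_spacing(rng, sigma):
    """Level spacing of one real symmetric 2x2 matrix of the form (10-1a)."""
    e1 = rng.gauss(0.0, sigma)
    e2 = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, sigma)
    # The eigenvalues of [[e1, v/sqrt(2)], [v/sqrt(2), e2]] differ by
    # sqrt((e1 - e2)^2 + 2 v^2).
    return math.sqrt((e1 - e2) ** 2 + 2.0 * v ** 2)


def sample_spacings(n, sigma=1.0, seed=1):
    rng = random.Random(seed)
    return [goe_2x2_spacing(rng, sigma) for _ in range(n)]


def wigner_cdf(w, sigma=1.0):
    """Integral of (10-1b) from 0 to w, i.e. 1 - exp(-w^2 / (4 sigma^2))."""
    return 1.0 - math.exp(-w * w / (4.0 * sigma ** 2))


spacings = sample_spacings(100_000)
mean_spacing = sum(spacings) / len(spacings)  # (10-1b) gives sigma*sqrt(pi)
frac_small = sum(s < 0.2 for s in spacings) / len(spacings)

if __name__ == "__main__":
    print("mean spacing:", mean_spacing, " predicted:", math.sqrt(math.pi))
    print("fraction below 0.2:", frac_small, " predicted:", wigner_cdf(0.2))
```

The small fraction of spacings below $0.2\sigma$ is the numerical face of level repulsion: it grows only quadratically with the threshold, as follows from the linear vanishing of $P(\omega)$ at small $\omega$.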


The features of level repulsion and Gaussian fall-off at large separations
are apparent in this case too. More generally, these features are
present in the level statistics of ensembles of matrices of arbitrarily large
dimensions with elements drawn from a Gaussian distribution so that
the probability of occurrence of the matrix $\hat H$, with elements $H_{ij}$ ($i, j =
1, 2, \cdots, D$, say), where $D$ stands for the matrix dimension, is given by

$$P(\hat H) \propto \exp\left(-\frac{\gamma}{2a^2} {\rm Tr}(\hat H^2)\right) = \exp\left(-\frac{\gamma}{2a^2} \sum_{i,j} H_{ij} H_{ji}\right). \tag{10-2}$$

Here $a$ is a parameter that sets the energy scale (related to $D$) and $\gamma$ takes
the value 1 or 2 for the case of time-reversal invariant or non-invariant
matrices. These two matrix ensembles are referred to as the Gaussian
orthogonal ensemble (GOE) and the Gaussian unitary ensemble (GUE)
respectively, since these are invariant under orthogonal and unitary
transformations. The level spacing statistics (i.e., the statistics of separations
between pairs of nearest energy levels) in the two cases are approximately
given by


$$\text{[GOE:]}\quad P(\omega) = \frac{\pi}{2}\, \omega \exp\left(-\frac{\pi}{4} \omega^2\right), \qquad
\text{[GUE:]}\quad P(\omega) = \frac{32}{\pi^2}\, \omega^2 \exp\left(-\frac{4}{\pi} \omega^2\right), \tag{10-3a}$$

in each of which the mean level spacing is set at unity ($\bar\omega = 1$). The two
expressions in (10-3a) are of the common form

$$P(\omega) = A\, \omega^{\gamma} e^{-B \omega^2}, \tag{10-3b}$$

where the constants $A$, $B$ are determined by the normalization condition
and the mean level spacing. Formula (10-3b) is referred to as the ‘Wigner
surmise’, which has been found to be closely followed by the actual level
spacing statistics of ensembles of large matrices.
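
As a worked check, the constants in the Wigner surmise follow from two Gaussian integrals. The sketch below imposes unit normalization and unit mean spacing on $P(\omega) = A\omega^{\gamma}e^{-B\omega^2}$ for $\gamma = 1$ and $\gamma = 2$, and recovers the standard GOE and GUE values $(A, B) = (\pi/2, \pi/4)$ and $(32/\pi^2, 4/\pi)$.

```latex
\begin{align*}
&\int_0^\infty \omega\, e^{-B\omega^2} d\omega = \frac{1}{2B}, \qquad
\int_0^\infty \omega^2 e^{-B\omega^2} d\omega = \frac{\sqrt{\pi}}{4 B^{3/2}}, \qquad
\int_0^\infty \omega^3 e^{-B\omega^2} d\omega = \frac{1}{2B^2}, \\[1mm]
\gamma = 1:\quad & \frac{A}{2B} = 1, \;\; \frac{A\sqrt{\pi}}{4B^{3/2}} = 1
\;\;\Longrightarrow\;\; B = \frac{\pi}{4}, \;\; A = \frac{\pi}{2}, \\[1mm]
\gamma = 2:\quad & \frac{A\sqrt{\pi}}{4B^{3/2}} = 1, \;\; \frac{A}{2B^2} = 1
\;\;\Longrightarrow\;\; B = \frac{4}{\pi}, \;\; A = \frac{32}{\pi^2}.
\end{align*}
```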


It was Eugene Wigner who charted the path to the random matrix theory 
which has since developed into a remarkably fertile field covering a vast 


area in physics and mathematics. 


Moving from the domain of large matrices to that of systems of particles
following Hamiltonian dynamics, it was conjectured by Bohigas, Giannoni,
and Schmit (the ‘BGS conjecture’) that the level spacing statistics
of systems whose classical counterparts are chaotic are in accord with the
Wigner surmise. A large number of studies have been found to support the
BGS conjecture and it is now established that the level spacing data for
Hamiltonian systems of interest in statistical mechanics, including those
for which classical analogs do not exist (fermionic systems for instance),
conform to the Wigner-Dyson statistics, closely approximated by (10-3b),
at least for sufficiently high energy states within relatively narrow windows.


In other words, random matrix theory has, in a sense, moved ahead from
the BGS conjecture in that, even for systems without a chaotic classical
analog, the Wigner-Dyson level spacing statistics is taken as the signature
of quantum chaos. As we see below, the RMT makes statements not
only about the statistical features of the energy spectrum of a quantum
mechanical system but (what is of no less importance) about the energy
eigenvectors too, the latter providing the foundation of the eigenstate
thermalization hypothesis.


In contrast to the Wigner-Dyson statistics, one observes a distinct set of
spectral features in systems whose classical counterparts are integrable.
As conjectured by Berry and Tabor, the energy levels of an integrable
system, being determined by a set of quantum numbers, follow a random
sequence depending on various possible combinations of the quantum
numbers, in consequence of which these follow a Poisson distribution,
and the level spacing statistics (with the mean level spacing set at unity)
is given by

$$P(\omega) = e^{-\omega}. \tag{10-4}$$

The level spacing distribution following from the statistics of energy levels
of an integrable system is at times referred to as the Poisson distribution
for the sake of easy reference, though it is the latter (i.e., the distribution
of energy levels rather than that of the level spacings) that follows the
Poisson statistics.
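
The contrast between level clustering and level repulsion can be seen in a few lines of code. The sketch below is only an illustration: it models the levels of an ‘integrable’ spectrum as independent uniform random numbers (an assumption standing in for the Berry-Tabor picture, not a computation for any particular Hamiltonian), rescales the mean spacing to unity, and compares the fraction of spacings below 0.1 with the predictions of (10-4) and of the GOE form of the Wigner surmise, $P(\omega) = (\pi/2)\,\omega\, e^{-\pi\omega^2/4}$.

```python
import math
import random


def poisson_level_spacings(n_levels, seed=2):
    """Nearest-neighbour spacings of uncorrelated levels, mean spacing scaled to 1."""
    rng = random.Random(seed)
    levels = sorted(rng.random() for _ in range(n_levels))
    gaps = [b - a for a, b in zip(levels, levels[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return [g / mean_gap for g in gaps]


def fraction_below(spacings, w):
    """Fraction of spacings smaller than w."""
    return sum(s < w for s in spacings) / len(spacings)


unfolded = poisson_level_spacings(100_000)
observed = fraction_below(unfolded, 0.1)
poisson_prediction = 1.0 - math.exp(-0.1)                    # integral of (10-4)
goe_prediction = 1.0 - math.exp(-math.pi * 0.1 ** 2 / 4.0)   # integral of GOE surmise

if __name__ == "__main__":
    print("observed:", observed)
    print("Poisson prediction:", poisson_prediction)
    print("GOE prediction:", goe_prediction)
```

For uncorrelated levels roughly one spacing in ten falls below a tenth of the mean spacing, whereas the GOE surmise allows fewer than one in a hundred.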


The Wigner-Dyson distribution of the level spacings and the distribution
(10-4) following from the Poisson statistics for the energy levels
constitute the two contrasting ends of a ‘spectrum’ made up of possible types of
level spacing statistics, where intermediate forms of level spacing
distributions are also found to exist. For systems with classical counterparts,
the intermediate forms of the level spacing distributions correspond to
co-existing regions in the phase space characterized by regular and chaotic
dynamics. The regular regions are made up of what are referred to as
the KAM tori on which the system follows quasi-periodic dynamics, while
the chaotic regions are made up of trajectories that acquire features of
hyperbolic dynamics.


Such mixed features make their appearance for systems whose Hamiltonians
are obtained by adding perturbations over integrable ones. If $\lambda$
denotes the strength of such a perturbation (i.e., the value $\lambda = 0$
corresponds to an integrable Hamiltonian), then it is generally found that
the measure of the regular region in the phase space decreases with
increasing values of $\lambda$ (correspondingly, the measure of the chaotic region
increases; these statements, however, are somewhat sweeping in nature
and are, necessarily, not precise). This is generally reflected in the level
spacing statistics, obtained numerically for various systems of interest,
where one observes a gradual shift in the former from a Berry-Tabor
distribution towards a Wigner-Dyson one. Fig. 10-1 depicts schematically
the level spacing distribution for (A) a chaotic system, (B) an integrable
system, and (C) a mixed system where ‘(C)’ appears as an interpolation
between ‘(A)’ and ‘(B)’.

Figure 10-1: Depicting schematically the level spacing distribution for (A) the
Wigner-Dyson statistics, (B) the Poisson statistics; one observes the feature of
level repulsion in the former and level clustering in the latter; (C) depicts a level
spacing distribution of an intermediate type for a system with mixed phase space
in the classical description; as the fraction of the phase space volume covered by
regular trajectories on KAM tori decreases, with a corresponding increase in the
fraction of phase space volume covered by chaotic trajectories, the intermediate
distribution approaches the Wigner-Dyson graph more and more closely.


Generally speaking, a family of KAM tori in the phase space is inter- 
spersed with chaotic layers since the tori with a set of commensurate 
frequencies break up because of resonance effects, thereby opening 
narrow windows of irregular motion; with the break-up of more and 
more tori, the irregular regions in the phase space keep on expand- 


ing. 


As mentioned above, the random matrix theory makes statements of
substantial relevance about the energy eigenvectors of a chaotic quantum
mechanical system. This aspect of the theory may be approached along
the route of Berry’s conjecture (sec. 10.2.4), which throws light on the
idea of quantum ergodicity.


10.2.3 Examples of Wigner-Dyson statistics 


The Wigner-Dyson level-spacing statistics have been verified in a large
number of systems many times over.


The most well-known among these numerous corroborations of the Wigner- 
Dyson statistics are those relating to heavy nuclei, where data collected 
from slow neutron resonance or proton resonance experiments, among 
others, have been found to be in excellent agreement with the Wigner- 


Dyson prediction, as depicted schematically in fig. 10-1(A). 


A second set of well-studied data relates to the spectral lines of the hydrogen
atom in a magnetic field, where the rotational symmetry of the Coulomb
potential is broken, with breaking of conservation of the total angular
momentum. The phase space of the classical system admits of both
regular and chaotic regions at relatively low and high energies
respectively (a dimensionless energy is defined for the system in terms of an
energy unit proportional to $B^{2/3}$, where $B$ is the magnetic flux density).
One observes a clear transition, in spectral data generated in numerical
simulations, from a Poisson type distribution (analogous to the one in
fig. 10-1(B)) to the Wigner-Dyson distribution, for low to high values of
the dimensionless energy. For intermediate energy values, interpolating
curves of the type depicted in fig. 10-1(C) are obtained.


Analogous to the hydrogen atom in a magnetic field, one observes
characteristic spectral features corresponding to integrable and chaotic systems
in the case of a particle in cavities of various different shapes. In the
case of a rectangular cavity, one obtains a level spacing distribution
attributable to a Poisson distribution of the energy levels in accordance
with the Berry-Tabor conjecture while in cavities with boundaries made
up of straight line segments and circular arcs (the Bunimovich stadium
and other related figures [26]), there appears a clear indication of the
Wigner-Dyson level spacing distribution.


Finally, we refer to interacting lattice models of bosons and fermions
that have no classical counterpart, but which still exhibit a clear
transition from Poisson type statistics to the Wigner-Dyson distribution of
the gaps between successive energy levels. As an example, we consider
a one-dimensional lattice model of spinless (spin-polarized) fermions with
nearest- and next-nearest neighbor interactions (coupling strengths $V$, $V'$),
and nearest- and next-nearest neighbor hoppings (strengths $J$, $J'$),
represented by the Hamiltonian [126]

$$\hat H = \sum_{j=1}^{L} \Big[ V\big(\hat n_j - \tfrac{1}{2}\big)\big(\hat n_{j+1} - \tfrac{1}{2}\big) + V'\big(\hat n_j - \tfrac{1}{2}\big)\big(\hat n_{j+2} - \tfrac{1}{2}\big)
- J\big(\hat f_j^\dagger \hat f_{j+1} + \hat f_{j+1}^\dagger \hat f_j\big) - J'\big(\hat f_j^\dagger \hat f_{j+2} + \hat f_{j+2}^\dagger \hat f_j\big) \Big]. \tag{10-5}$$

In this expression, $\hat f_j$, $\hat f_j^\dagger$ are fermionic annihilation and creation operators
at site $j$, satisfying the usual anti-commutation rules, $\hat n_j$ is the occupation
number operator at site $j$, and $L$ is the number of lattice sites; periodic
boundary conditions are assumed ($\hat f_{L+1} = \hat f_1$, $\hat f_{L+2} = \hat f_2$). The model
possesses an effectively classical limit at low filling ratios ($N/L$, where $N$ is the
number of filled sites) and high energies, though quantum effects become
important at higher filling ratios.
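
To make the structure of (10-5) concrete, the following sketch builds its matrix in a fixed particle-number sector using only the Python standard library. It is an illustrative construction under stated assumptions, not the procedure used in [126]: basis states are bit strings of occupied sites, fermionic signs are obtained by counting occupied sites below the site acted upon (a Jordan-Wigner type ordering), and the names `hop`, `build_hamiltonian`, `J2`, `V2` (the last two standing for $J'$, $V'$) as well as the default parameter values are hypothetical. Resolving the translational and other discrete symmetries, and diagonalizing the matrix, would be further steps that are not shown.

```python
from itertools import combinations


def hop(state, a, b):
    """Apply f_a^dagger f_b to a bit-string state.

    Returns (sign, new_state), or None if the result vanishes. Assumes a != b.
    """
    if not (state >> b) & 1 or (state >> a) & 1:
        return None
    # Fermionic sign: parity of the number of occupied sites below the site
    # on which each operator acts.
    sign = -1 if bin(state & ((1 << b) - 1)).count("1") % 2 else 1
    state ^= 1 << b
    if bin(state & ((1 << a) - 1)).count("1") % 2:
        sign = -sign
    state ^= 1 << a
    return sign, state


def build_hamiltonian(L, N, J=1.0, V=1.0, J2=0.5, V2=0.5):
    """Matrix of (10-5) for N fermions on L >= 3 sites, periodic boundaries."""
    basis = [sum(1 << s for s in occ) for occ in combinations(range(L), N)]
    index = {state: k for k, state in enumerate(basis)}
    dim = len(basis)
    H = [[0.0] * dim for _ in range(dim)]
    for col, state in enumerate(basis):
        n = [(state >> s) & 1 for s in range(L)]
        for j in range(L):
            for dist, hopping, coupling in ((1, J, V), (2, J2, V2)):
                k = (j + dist) % L  # periodic boundary conditions
                H[col][col] += coupling * (n[j] - 0.5) * (n[k] - 0.5)
                for a, b in ((j, k), (k, j)):
                    result = hop(state, a, b)
                    if result is not None:
                        sign, new_state = result
                        H[index[new_state]][col] += -hopping * sign
    return basis, H


if __name__ == "__main__":
    basis, H = build_hamiltonian(6, 3)
    print("dimension of the N = 3, L = 6 sector:", len(basis))
```

The resulting matrix is real and symmetric, as required for a time-reversal invariant Hamiltonian whose level statistics are to be compared with the GOE prediction.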


The system defined by (10-5) is integrable for $V' = 0$, $J' = 0$ while higher
values of these two parameters make it more and more chaotic. The level
spacing distribution can be studied as a function of $V'$, $J'$ (the choice $V' =
J'$ is convenient) and of the system size, with a given value of $N/L$ such that
quantum effects are important. It is found that there is a transition from
Poisson type level-spacing distribution to one of the Wigner-Dyson type
as $V' (= J')$ is made to increase, where the latter fits clearly with the GOE
prediction for sufficiently large values of the system size ($L$), when the
energy spectrum becomes dense, and RMT predictions are conformed to
at relatively lower values of $V'$. In comparing the computed level-spacing
statistics with the theoretically predicted values, it is required that the
possible symmetries of the Hamiltonian be accounted for (for instance,
the translational symmetry and other possible discrete symmetries) and
the energy eigenvalues be computed in various different sectors, each of
which is free of degeneracies.


It appears that, for models of the type referred to, the transition to quan- 
tum chaos occurs at relatively lower values of the integrability-breaking 
parameter(s) (V’, J’ in the present example) as the system size is made 
to increase. This suggests strongly that, in the thermodynamic limit, 
such systems will exhibit quantum chaos even at an arbitrarily small 
strength of integrability breaking. However, for disordered systems ex- 
hibiting many-body localization, the transition to quantum chaos seems 


to appear at finite strengths of integrability breaking. 
10.2.4 Berry’s conjecture


Berry’s conjecture relates to the semi-classical limit of the energy
eigenfunctions of a system whose classical counterpart is chaotic, where the
structure of the eigenfunctions is explored by looking at their Wigner
functions. The Wigner function of a quantum mechanical state is defined over
the phase space of the corresponding classical system. Considering a
normalized pure state $|\psi\rangle$, its Wigner function is defined as

$$W(Q, P) = \frac{1}{(2\pi\hbar)^{3N}} \int dQ'\, \Big\langle Q - \frac{Q'}{2} \Big| \psi \Big\rangle \Big\langle \psi \Big| Q + \frac{Q'}{2} \Big\rangle\, e^{\frac{i}{\hbar} P Q'}
= \frac{1}{(2\pi\hbar)^{3N}} \int dP'\, \Big\langle P - \frac{P'}{2} \Big| \psi \Big\rangle \Big\langle \psi \Big| P + \frac{P'}{2} \Big\rangle\, e^{-\frac{i}{\hbar} P' Q}, \tag{10-6}$$

where the notation is as follows. In continuation of earlier notation, $Q$
stands for the collection of $3N$ number of components ($N$ = number of
particles making up the system under consideration) of the position
vectors $(r_1, r_2, \cdots, r_N)$, taken in order; $Q'$ has a similar interpretation, but now
as an integration variable in the configuration space (the components of $Q$, $Q'$ are
assumed to be in corresponding order); $dQ'$ denotes the $3N$-dimensional
volume element in the configuration space ($d^{(3)}r'_1 \cdots d^{(3)}r'_N$); $P$, $P'$ stand for the
collection of momentum components conjugate to those in $Q$, $Q'$, and $dP'$
denotes the $3N$-dimensional volume element $d^{(3)}p'_1 \cdots d^{(3)}p'_N$. Additionally,
a product of the form $PQ$ is meant to stand for $\sum_i p_i \cdot q_i$. According to
formula (10-6), the Wigner function is the Fourier transform of the
local autocorrelation (in either the configuration space or the momentum
space) of the wave function $\langle Q|\psi\rangle$ or $\langle P|\psi\rangle$.

More generally, the Wigner function for a mixed state represented by the
density matrix $\hat\rho$ is obtained from (10-6) by replacing $|\psi\rangle\langle\psi|$ with $\hat\rho$.


The Wigner function resembles a probability distribution in the phase
space. Though it can take up negative values in the phase space and is
characterized by oscillations on a fine scale (of the order of $\hbar$ per degree
of freedom), it satisfies, for a pure state $|\psi\rangle$,

$$\int dP\, W(Q, P) = |\langle Q|\psi\rangle|^2, \qquad \int dQ\, W(Q, P) = |\langle P|\psi\rangle|^2, \tag{10-7a}$$

and, for a density matrix $\hat\rho$,

$$\int dQ\, dP\, W(Q, P) = {\rm Tr}\, \hat\rho = 1. \tag{10-7b}$$
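
A standard worked example may help fix ideas. The sketch below assumes a single degree of freedom (so that the prefactor of the Wigner function is $1/(2\pi\hbar)$), a Gaussian wave packet of width $s$ (a symbol introduced here only for illustration), and the sign convention in which a plane wave $e^{ikq}$ has its Wigner function concentrated at $p = \hbar k$. In this case the Wigner function is positive everywhere and its marginal over $p$ reproduces $|\psi(q)|^2$.

```latex
\begin{align*}
\psi(q) &= (\pi s^2)^{-1/4}\, e^{-q^2/(2s^2)}, \\
W(q, p) &= \frac{1}{2\pi\hbar} \int dq'\, \psi\Big(q - \frac{q'}{2}\Big)\, \psi^*\Big(q + \frac{q'}{2}\Big)\, e^{\frac{i}{\hbar} p q'}
= \frac{1}{2\pi\hbar}\, \frac{e^{-q^2/s^2}}{\sqrt{\pi s^2}} \int dq'\, e^{-q'^2/(4s^2) + \frac{i}{\hbar} p q'} \\
&= \frac{1}{\pi\hbar}\, \exp\Big(-\frac{q^2}{s^2} - \frac{s^2 p^2}{\hbar^2}\Big), \\
\int dp\, W(q, p) &= \frac{1}{\sqrt{\pi}\, s}\, e^{-q^2/s^2} = |\psi(q)|^2 .
\end{align*}
```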


The Wigner function is used, in conjunction with the Weyl representations
of observables, to represent quantum mechanical relations formally in
terms of expressions relating to the phase space. Given an operator $\hat A$
representing an observable, its Weyl representation is defined as

$$\tilde A(Q, P) = \int dQ'\, \Big\langle Q - \frac{Q'}{2} \Big| \hat A \Big| Q + \frac{Q'}{2} \Big\rangle\, e^{\frac{i}{\hbar} P Q'}, \tag{10-8}$$

where the tilde is used over the symbol $A$ in the left hand side to indicate
that it is a Weyl representation that is being referred to (with this
normalization the identity operator is represented by unity). The expectation
value of $\hat A$ in the state $\hat\rho$ can be expressed in terms of $\tilde A(Q, P)$ and $W(Q, P)$
in the form

$$\langle \hat A \rangle = {\rm Tr}(\hat A \hat\rho) = \int dQ\, dP\, W(Q, P)\, \tilde A(Q, P), \tag{10-9}$$

which represents $\langle \hat A \rangle$ in the form of a classical phase space average.

With this introduction to Wigner functions, Berry’s conjecture can be
stated as follows [26], [6]:


In the semi-classical limit ($\hbar \to 0$), the Wigner function of a typical
energy eigenstate corresponding to eigenvalue $E$ (say), averaged over a
vanishingly small region of the phase space, reduces to the microcanonical
distribution, i.e.,

$$\overline{W}(Q, P) \equiv \int_{\Delta_1} \frac{d^{(3)}r_1\, d^{(3)}p_1}{(2\pi\hbar)^3} \cdots \int_{\Delta_N} \frac{d^{(3)}r_N\, d^{(3)}p_N}{(2\pi\hbar)^3}\, W(Q, P)
= \frac{\delta\big(H(Q, P) - E\big)}{\int dQ\, dP\, \delta\big(H(Q, P) - E\big)} \tag{10-10}$$

(recall that $W(Q, P)$ stands for $W(r_1, p_1, \cdots, r_N, p_N)$). In this expression,
$\Delta_j$ ($j = 1, 2, \cdots, N$) stands for a small 6-dimensional surface element
around $(r_j, p_j)$ in the phase space such that $\Delta_j \to 0$ as $\hbar \to 0$ and, at
the same time, $\hbar^3/\Delta_j$ also goes to zero. Thus, $\overline{W}$ represents the Wigner
function for the eigenstate under consideration, smoothed over a region in
the phase space in which the quantum mechanical uncertainty prevails,
causing the Wigner function to oscillate. In the second line of the above
formula, the delta-function in the numerator indicates that the smoothed
Wigner function represents a uniform distribution on the energy surface
$H(Q, P) = E$ (where $H(Q, P)$ is the classical Hamiltonian), i.e., the
probability distribution corresponding to the microcanonical ensemble. Berry’s
conjecture has not been proved but there exists a large body of evidence
in its support (detailed heuristic considerations regarding the structure
of semi-classical eigenfunctions $\psi(Q) = \langle Q|\psi\rangle$ are to be found in [6]).
In the case of an integrable system, the smoothed Wigner function
is distributed uniformly over a $3N$-dimensional torus in the phase
space, the latter being determined by the value of the action integral
corresponding to the energy $E$ of the semi-classical eigenfunction
under consideration.


What is notable in this conjecture of Berry’s is that it asserts an
equivalence, in the semi-classical limit, between each individual energy
eigenstate (at the level of the smoothed Wigner function) and the classical
microcanonical distribution. It follows that, in the semi-classical limit,
the expectation values of observables in an individual eigenstate coincide
with their microcanonical averages, provided the classical counterpart of
the system under consideration is chaotic and that the expectation values
are evaluated by invoking an appropriate process of local averaging.


The idea underlying Berry’s conjecture can be grasped by referring to a
dilute gas of hard spheres confined in a box with a hard wall, for which
the classical dynamics is strongly chaotic. A classical trajectory for a
given energy typically consists of straight line segments with a fixed value
of $P^2$ (defined below), whose direction keeps on changing randomly. In
the quantum description, an energy eigenfunction $\psi_n(Q)$ (for a sufficiently
large $E (= E_n)$) can be represented as a superposition of plane waves of
the form

$$\psi_n(Q) = C_n \int dP\, A_n(P)\, \delta\Big(\frac{P^2}{2m} - E_n\Big)\, e^{\frac{i}{\hbar} P Q}, \tag{10-11}$$

($dP = d^{(3)}p_1\, d^{(3)}p_2 \cdots d^{(3)}p_N$, $P^2 = p_1^2 + \cdots + p_N^2$) where $n$ labels the individual
eigenstates and $C_n$ stands for a normalization constant. What is of crucial
relevance here is to note that the amplitudes $A_n(P)$ (which satisfy the
reality condition $A_n^*(P) = A_n(-P)$) are random coefficients in the case of a
system executing classically chaotic motion. In accordance with Berry’s
analysis, these can be taken to be Gaussian random variables, for which
the following holds,

$$\langle A_m^*(P)\, A_n(P') \rangle_{EE} = \delta_{mn}\, \frac{\delta^{(3N)}(P - P')}{\delta(P^2 - P'^2)}, \tag{10-12}$$

where the averaging on the left hand side is over the ensemble of
eigenstates generated by the fluctuating amplitudes $A_n(P)$. The result (10-10)
follows in the case of this model from (10-12) on making use of the
definition (10-6), where the averaging over the regions $\Delta_1, \cdots, \Delta_N$ in the
semi-classical limit is to be replaced with $\langle \cdots \rangle_{EE}$ mentioned above (see
sec. 10.2.5 below).


In other words, what Berry’s conjecture asserts in the semi-classical 
limit of a system possessing a chaotic classical counterpart, is mir- 
rored by the structure of the energy eigenvalues and eigenvectors of 
a system even away from the semi-classical limit provided that, once 
again, the system under consideration has a chaotic classical coun- 


terpart, the latter being precisely the content of the BGS conjecture. 


Further, starting with $\psi_n(Q)$ one can work out the Fourier transform and
obtain $\phi_n(P)$, and then obtain the single-particle momentum distribution
by averaging out the other degrees of freedom,


$$f_n(p_1) = \int d^3p_2 \cdots d^3p_N\, \left\langle \phi_n^*(P)\, \phi_n(P)\right\rangle_{EE}, \tag{10-13}$$


which works out to 


$$f_n(p_1) = \left(\frac{1}{2\pi m k_B T_n}\right)^{3/2} e^{-\frac{p_1^2}{2 m k_B T_n}} \qquad \left(T_n \equiv \frac{2E_n}{3N k_B}\right), \tag{10-14}$$


in the limit $N \to \infty$, which is nothing but the Maxwell-Boltzmann
momentum distribution formula in an ideal gas at temperature $T_n$.
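

The way in which (10-14) emerges from the energy-shell constraint can be checked numerically. The following Python sketch is an illustration added here, not part of the analysis in [129]: it assumes that the momentum-space weight of an eigenstate is spread uniformly over the shell $P^2 = 2mE_n$ (as suggested by (10-12)), samples points on that $3N$-dimensional sphere, and compares the statistics of one Cartesian component of $p_1$ with the Maxwell-Boltzmann prediction. The function name `sample_shell_momenta` and all parameter values are hypothetical choices.

```python
import numpy as np

def sample_shell_momenta(n_particles, energy, mass, n_samples, rng):
    """Draw momenta P uniformly from the shell P^2 = 2*m*E in 3N dimensions."""
    dim = 3 * n_particles
    g = rng.standard_normal((n_samples, dim))
    # Normalising Gaussian vectors gives points uniform on the unit sphere.
    unit = g / np.linalg.norm(g, axis=1, keepdims=True)
    return np.sqrt(2.0 * mass * energy) * unit

rng = np.random.default_rng(0)
N, m, kB = 200, 1.0, 1.0          # illustrative values (assumptions)
E = 300.0                         # total energy, arbitrary units
T = 2.0 * E / (3.0 * N * kB)      # temperature defined as in (10-14)

P = sample_shell_momenta(N, E, m, n_samples=5000, rng=rng)
p1x = P[:, 0]                     # one Cartesian component of particle 1

var_sampled = p1x.var()
var_mb = m * kB * T               # Maxwell-Boltzmann variance per component
# Excess kurtosis vanishes for a Gaussian; on the shell it is of order 1/N.
kurt = np.mean(p1x**4) / var_sampled**2 - 3.0
print(var_sampled, var_mb, kurt)
```

For large $N$ the sampled variance approaches $m k_B T_n$ and the excess kurtosis tends to zero, which is the Gaussian form of (10-14) for a single component.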


One can also work out the fluctuations in the single-particle momentum
distribution, which are found to be exponentially small in $N$. What is more,
if one imposes the condition of permutation symmetry or anti-symmetry on
the $N$-particle wave function, one obtains, instead of the MB distribution,
the BE or the FD distribution respectively [129]. This example of the hard
sphere gas, which has here been considered away from the semi-classical
limit, illustrates the content of Berry’s conjecture as explained in [7]: an
energy eigenstate is a superposition of simpler states (plane waves in the
case of weakly interacting systems even away from the limit $\hbar \to 0$; WKB
type wave functions in the semi-classical case) with amplitudes that are 


effectively a set of random Gaussian variables. 


10.2.5 RMT: statistics of eigenvectors 


Having had a look at Berry’s conjecture, we now outline the statistical
features of eigenvectors of random matrix ensembles. In the RMT approach,
an eigenvector is a statistically defined object: focusing on the
energy eigenvalues of any particular member of a matrix ensemble (generically
speaking, these eigenvalues form a non-degenerate spectrum) and
the corresponding eigenvectors, one has to then refer to an ensemble of
energy levels in the place of a single precisely determined one and,
correspondingly, an ensemble of eigenvectors for that ensemble of eigenvalues.
In other words, labeling the eigenvalues and eigenvectors (say, in the
increasing order of energy) of a single matrix as $E_n$ and $|\psi_n\rangle$ ($n = 0, 1, 2, \cdots$),
one ends up with, in the place of an individual eigenvalue $E_n$ and the
corresponding eigenvector $|\psi_n\rangle$, an ensemble $\{E_n\}$ and the corresponding
ensemble $\{|\psi_n\rangle\}$. Here and in the following, when we speak of individual
eigenvalues and eigenvectors we will mean these in this ensemble sense,
where an ensemble average will, in general, be implied, this average having
been denoted by the symbol $\langle\cdots\rangle_{EE}$ in sec. 10.2.4.


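As in the case of the eigenvalues, the RMT implies that the eigenvectors
of random matrix ensembles are statistically defined objects. As mentioned
above, if $E$ is an eigenvalue of any particular matrix belonging to
an ensemble (either the GOE or the GUE) and $|\psi\rangle$ is the corresponding
eigenvector, then the components of $|\psi\rangle$ (say, $\psi_1, \psi_2, \cdots, \psi_D$) in any
reasonably chosen fixed basis are distributed as


$$\begin{aligned}
[\mathrm{GOE}]\quad & P(\psi_1, \psi_2, \cdots, \psi_D) \propto \delta\Big(\sum_{q=1}^{D} \psi_q^2 - 1\Big),\\
[\mathrm{GUE}]\quad & P(\psi_1, \psi_2, \cdots, \psi_D) \propto \delta\Big(\sum_{q=1}^{D} |\psi_q|^2 - 1\Big),
\end{aligned} \tag{10-15}$$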


where we now consider, instead of a particular matrix (possibly chosen in
some biased manner) in the ensemble, a representative or a ‘typical’ one,
that may be imagined to vary within the ensemble, sampled uniformly.
Note that the distribution does not depend on the basis chosen which
means that within the RMT, all eigenvectors look alike in a statistical
description. For any particular matrix drawn from an ensemble (either the
GOE or the GUE, with a given dimension $D$) one eigenvector may have
components differing from those of another, but on drawing any other
matrix, the eigenvectors will change in a random manner, subject only to
the normalization condition expressed in (10-15). In the case of a system
of $N$ particles described by a Hamiltonian having a complex structure,
the eigenvectors (referred to in some particular order, say in the order of
increasing energy) are fixed ones, but nearby eigenvectors look alike and
each can effectively be treated as a member of an ensemble (the eigenvector
ensemble over which the average has been denoted by the symbol
$\langle\cdots\rangle_{EE}$ in sec. 10.2.4). It is this statistical uniformity that is reflected in
Berry’s conjecture (10-10) (recall that in the semi-classical limit $\langle\cdots\rangle_{EE}$ is
replaced with the averaging over the small regions $\Delta_j$ ($j = 1, 2, \cdots, N$)
applied to the relevant Wigner function). For such a system (i.e., one with
a specified Hamiltonian), the eigenvectors look alike in a statistical
description when referred to a basis that is not a special or exceptional one.
For instance, in the basis defined by the energy eigenstates themselves,
all the eigenvectors have very special structures, and there is no question
of nearby eigenvectors looking statistically alike.
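

The statistical uniformity of eigenvectors can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions: `goe_matrix` is a hypothetical helper that uses one common normalization of the GOE, and the dimension $D = 400$ is arbitrary. It checks the normalization constraint in (10-15), the pooled component variance $1/D$, and compares the rescaled fourth moment of the components of an eigenvector at the spectral edge with that of one in the middle (a value close to 3 indicating Gaussian-like components).

```python
import numpy as np

def goe_matrix(dim, rng):
    """Real symmetric matrix with Gaussian entries (one common GOE normalisation)."""
    m = rng.standard_normal((dim, dim))
    return (m + m.T) / np.sqrt(2.0)

rng = np.random.default_rng(1)
D = 400
_, vecs = np.linalg.eigh(goe_matrix(D, rng))   # columns are eigenvectors

norms = np.sum(vecs**2, axis=0)                # normalisation constraint in (10-15)
comp_var = vecs.var()                          # pooled variance of all components
# Rescaled fourth moment D^2 * <psi_q^4>; the value 3 corresponds to
# Gaussian components of variance 1/D.
m4_low = D**2 * np.mean(vecs[:, 0]**4)
m4_mid = D**2 * np.mean(vecs[:, D // 2]**4)
print(norms.min(), norms.max(), D * comp_var, m4_low, m4_mid)
```

Both eigenvectors give comparable moments, up to sampling fluctuations, which is what "all eigenvectors look alike" means in practice.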


10.2.6 RMT: more on the structure of eigenvectors 


As indicated in sec. 10.2.5, the eigenvectors of an ensemble of random
matrices all look alike, in that each eigenvector has components evenly
distributed among the vectors of an arbitrarily chosen basis set, where it
is assumed that the latter is not a specially selected one. One expresses
this by saying that the components are delocalized with reference to the
basis set. A measure of the delocalization of an eigenvector $|\alpha\rangle$ ($\alpha = 1, 2, \cdots$)
in respect of the set of basis vectors, say, $\{|m\rangle;\ m = 1, 2, \cdots\}$, is
provided by the information entropy


$$S_\alpha = -\sum_m |c_m^{(\alpha)}|^2 \ln |c_m^{(\alpha)}|^2 \qquad \left(c_m^{(\alpha)} \equiv \langle m|\alpha\rangle,\ \alpha = 1, 2, \cdots\right), \tag{10-16}$$


where the right hand side is to be interpreted in the sense of an ensem- 
ble average (refer to [80] for detailed considerations, which lie outside the 
scope of the present introductory exposition). The delocalization of eigen- 
vectors in the GOE results in the following approximate expression for 


the information entropy, 
$$S_\alpha \approx \ln(0.48\,D). \tag{10-17}$$


Numerical studies on the lattice model (10-5) show that the information
entropy of eigenstates is bounded above by (10-17), but falls below the
bound especially near the two edges of the spectrum, implying that the
states at the edges remain ‘localized’. As for the states near the middle,
the RMT value is approached as the integrability-breaking parameters
$V'$, $J'$ and the system size ($L$) are made to increase, when quantum chaos
prevails.
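

The information entropy (10-16) is straightforward to evaluate numerically. The following sketch is an illustration only; the helper `information_entropy` and the matrix size are assumptions made here. It averages the entropy over the eigenvectors of one GOE-like matrix and compares it with two reference cases, a fully delocalized vector (entropy $\ln D$) and a single basis vector (entropy 0). For Gaussian-distributed components the ensemble average works out to $\ln D - (2 - \gamma - \ln 2) \approx \ln(0.48\,D)$, where $\gamma$ is Euler's constant.

```python
import numpy as np

def information_entropy(vec):
    """S = -sum_m |c_m|^2 ln |c_m|^2 for a normalised vector, as in (10-16)."""
    p = np.abs(vec) ** 2
    p = p[p > 0.0]                 # 0 * ln 0 is taken as 0
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(2)
D = 300
m = rng.standard_normal((D, D))
h = (m + m.T) / np.sqrt(2.0)       # a GOE-like real symmetric matrix
_, vecs = np.linalg.eigh(h)

entropies = np.array([information_entropy(vecs[:, k]) for k in range(D)])
s_mean = entropies.mean()

# Reference values: a fully delocalised vector gives ln D, a basis vector gives 0.
s_uniform = information_entropy(np.full(D, 1.0 / np.sqrt(D)))
s_localized = information_entropy(np.eye(D)[0])
print(s_mean, np.log(0.48 * D), s_uniform, s_localized)
```

The same function can be applied to the eigenstates of a physical Hamiltonian expressed in a chosen basis, where values well below the GOE figure signal localization.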


Another important prediction of the RMT is that the eigenstates of
distinct chaotic Hamiltonians, even ones ‘close to’ one another, are
uncorrelated to a large degree. Thus, referring to (10-5), if we compare the
eigenstates of two systems with slightly different values of $V'$, $J'$ (but, say,
with identical values of $V$, $J$) then one expects the following: referred to
the basis formed by the eigenvectors of the system with parameter values
$V'$, $J'$, the eigenvectors for parameter values, say, $V' + \delta V'$, $J' + \delta J'$ will be
essentially random vectors even when the increments $\delta V'$, $\delta J'$ are small,
i.e., the eigenvectors of the perturbed system will have relatively large
information entropies in the unperturbed basis. Numerical simulations
indicate that the information entropies increase rapidly with the system
size and that the unperturbed eigenstates get completely mixed up in the
perturbed ones even for perturbations $\delta V'$, $\delta J'$ exponentially small in the
system size.


10.2.7 RMT: matrix elements of observables 


We now consider the statistics of matrix elements of Hermitian operators
in the RMT. Let $A$ be such an operator with eigenvalues $a_\alpha$ (assumed to
be non-degenerate) and eigenvectors $|\alpha\rangle$ ($\alpha = 1, 2, \cdots, D$), where $D$ is the
dimension of the matrices in the ensemble under consideration,


$$A|\alpha\rangle = a_\alpha|\alpha\rangle, \qquad A = \sum_{\alpha=1}^{D} a_\alpha\, |\alpha\rangle\langle\alpha|. \tag{10-18}$$

What do the matrix elements of $A$ look like with reference to the eigenvectors
of a typical matrix within the ensemble (a GOE or a GUE) where each
eigenvector (say, $|n\rangle$ ($n = 1, 2, \cdots, D$)) is again a member of an ensemble?
We have already seen that, statistically speaking, all the eigenvectors $|n\rangle$
look alike. Hence, the diagonal elements $A_{nn} = \langle n|A|n\rangle$ have to be identical
when considered within the ensemble of eigenvectors. As for the
off-diagonal matrix elements $A_{mn} = \langle m|A|n\rangle$ ($m, n = 1, 2, \cdots, D$), expressing
$A|n\rangle$ in terms of the eigenvectors $|m\rangle$ and invoking the statistical similarity
of the eigenvectors, one finds that for sufficiently large $D$, these
off-diagonal elements go to zero in comparison with the diagonal ones,
once again in an ensemble sense. In other words,


$$\begin{aligned}
\overline{A_{nn}} &= \frac{1}{D}\sum_{\alpha=1}^{D} a_\alpha \equiv \bar{A}\ \ \text{(say)},\\
\overline{A_{mn}} &= 0 \qquad (m \neq n;\ m, n = 1, 2, \cdots, D),
\end{aligned} \tag{10-19a}$$


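where the overline $\overline{(\cdots)}$ denotes an average over the ensemble of eigenvectors
(check the above results out; insert $A = \sum_\alpha a_\alpha |\alpha\rangle\langle\alpha|$). One can also
work out the fluctuations of the matrix elements within the random matrix
ensemble under consideration. Here one has the following results in
the limit of large $D$,


$$\begin{aligned}
\overline{A_{nn}^2} - \overline{A_{nn}}^{\,2} &= \begin{cases} \dfrac{2}{D}\,\overline{A^2} & \text{(GOE)},\\[2mm] \dfrac{1}{D}\,\overline{A^2} & \text{(GUE)}, \end{cases}\\
\overline{|A_{mn}|^2} - |\overline{A_{mn}}|^2 &= \frac{1}{D}\,\overline{A^2} \qquad (m \neq n),
\end{aligned} \tag{10-19b}$$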


(check these results out; insert $A^2 = \sum_\alpha a_\alpha^2 |\alpha\rangle\langle\alpha|$). Thus, for large $D$, one
can write


$$A_{mn} = \bar{A}\,\delta_{mn} + \sqrt{\frac{\overline{A^2}}{D}}\; R_{mn} \qquad (m, n = 1, 2, \cdots, D), \tag{10-20a}$$


where, as defined in (10-19a) and (10-19b),


$$\bar{A} = \frac{1}{D}\sum_\alpha a_\alpha, \qquad \overline{A^2} = \frac{1}{D}\sum_\alpha a_\alpha^2, \tag{10-20b}$$


and where $R_{mn}$ denotes a random variable with zero mean. For the GOE,
all the elements $R_{mn}$ are real while for the GUE, these are, in general,
complex. In the former case the variance for the diagonal elements ($R_{nn}$)
is 2, and that of the non-diagonal elements is 1; while, in the case of the
GUE, the variance of all the elements $R_{mn}$ is 1. The validity of (10-19a)
and (10-19b) can be seen to follow from (10-20a).
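

These statements on the matrix elements lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: the observable is taken to be diagonal in the fixed basis with eigenvalues $\pm 1$ (so that $\bar{A} = 0$ and $\overline{A^2} = 1$), the helper `goe_eigenvectors` is hypothetical, and the sizes are chosen only to keep the run short. It collects the diagonal and off-diagonal elements of $A$ in the eigenbases of several GOE-like matrices and rescales their variances by $D/\overline{A^2}$, to be compared with the values 2 and 1 quoted above for the GOE.

```python
import numpy as np

def goe_eigenvectors(dim, rng):
    """Eigenvectors (as columns) of one real symmetric Gaussian random matrix."""
    m = rng.standard_normal((dim, dim))
    _, vecs = np.linalg.eigh((m + m.T) / np.sqrt(2.0))
    return vecs

rng = np.random.default_rng(3)
D, n_matrices = 100, 40
# Observable diagonal in the fixed basis; eigenvalues chosen so that A_bar = 0.
a = np.where(np.arange(D) % 2 == 0, 1.0, -1.0)
a_bar, a2_bar = a.mean(), np.mean(a**2)

diag, offdiag = [], []
for _ in range(n_matrices):
    v = goe_eigenvectors(D, rng)
    a_mn = v.T @ (a[:, None] * v)          # A in the eigenbasis of the random matrix
    diag.append(np.diag(a_mn))
    offdiag.append(a_mn[np.triu_indices(D, k=1)])
diag, offdiag = np.concatenate(diag), np.concatenate(offdiag)

# Rescaled variances, to be compared with the values 2 and 1 quoted for the GOE.
r_diag = D * diag.var() / a2_bar
r_off = D * offdiag.var() / a2_bar
print(a_bar, diag.mean(), offdiag.mean(), r_diag, r_off)
```

At finite $D$ the rescaled variances differ from 2 and 1 by corrections of relative order $1/D$, in addition to sampling noise.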


It is once again necessary to distinguish between the results of the RMT 
where we deal with an ensemble of matrices and averages are taken with 
respect to the ensembles of eigenvalues and eigenvectors as the case may 
be, and those for a single Hamiltonian engendering a complex dynamics. 
In the latter case, averages are commonly taken with respect to nearby 
eigenvalues and eigenvectors. As we will see, the eigenstate thermalization
hypothesis asserts an additional structure on the matrix elements
$A_{mn}$, as compared with (10-20a).


In the RMT context, the symbol D has been used to denote the dimension of 
the matrices in an ensemble, which is meant to correspond to the dimension 
of the Hilbert space for any specified system under consideration, commonly 


with $D \to \infty$ exponentially with the system size.


10.2.8 Quantum chaos and delocalization


Closely related with the delocalization of eigenstates predicted by the RMT 
and outlined in sec. 10.2.6 is the phenomenon of delocalization of the 
density matrix in the energy space of the system under consideration, 
as outlined below - a phenomenon analogous to the delocalization of the 


phase space probability density for a classically chaotic system. 


Recalling the case of a classical system isolated with a given energy, the 
delocalization in the phase space shows up in the form of a spreading of 
an initial probability distribution throughout the phase space, approxi- 
mating the microcanonical ensemble in a weak sense. Equivalently, if one 
focuses on a reduced distribution function for a subsystem, the delocal- 
ization shows up by way of the reduced distribution function approaching 


the Gibbs canonical ensemble. 


An analogous situation obtains in the case of quantum chaos. Start- 
ing with a system made up of N particles, we consider an initial non- 


equilibrium state generated by a quench. 


If a system is described by a Hamiltonian H then a quench is defined 
to be an operation that releases the system at some time, say t = 0, 
in a state described by a mixture of eigenstates of H, after which the 
system is assumed to relax for a long time under the evolution deter- 
mined by H. At times prior to the quench, the system is assumed to 
be described by a Hamiltonian H’ that may or may not be the same 
as H. With H’ differing from H, the system attains an equilibrium 


state characterized by H’ prior to t = 0 when the Hamiltonian is set to 
H (by resetting a number of relevant parameters defining the Hamiltonian),
thereby effecting the quench. On the other hand, with H’ the
same as H, the quench is applied at t = 0 by a sudden change in the
value(s) of one or more relevant variables, e.g., by applying a pulse to 


a magnetic chain that flips a number of spins in the chain. 


One then focuses on a subsystem (say, A), and looks at the time evolution
of its reduced entropy ($S_A$) defined as the von Neumann entropy
of its reduced state ($\hat\rho_A$), the latter obtained as the partial trace of the
density matrix ($\mathrm{Tr}_B\hat\rho$) resulting from a summation over the states of the
complementary system (say, B). In other words


$$\hat\rho_A = \mathrm{Tr}_B\,\hat\rho, \qquad (\hat\rho_A)_{mn} = \sum_a \langle m\,a|\hat\rho|n\,a\rangle \quad (m, n = 1_A, 2_A, \cdots), \tag{10-21a}$$

and


$$S_A = -k_B\,\mathrm{Tr}\left(\hat\rho_A \ln \hat\rho_A\right). \tag{10-21b}$$


In these expressions, $|m\rangle, |n\rangle$ ($m, n = 1_A, 2_A, \cdots$) stand for basis states for
the subsystem A, while $|a\rangle$ ($a = 1_B, 2_B, \cdots$) are basis states pertaining
to the complementary subsystem B, it being understood that the entire
system under consideration is made up of A and B, the density matrix of
the total (composite) system being denoted by $\hat\rho$. In the case of a pure state
$|\psi\rangle$ (i.e., $\hat\rho = |\psi\rangle\langle\psi|$), the above expression (eq. (10-21b)) is referred to as
the entanglement entropy of the state $|\psi\rangle$ with reference to the subsystem.


A delocalization measure similar to the entanglement entropy is the second
Renyi entropy defined as


$$S^{(2)} = -\ln \mathrm{Tr}\,\hat\rho_A^{\,2}. \tag{10-21c}$$


For a system characterized by quantum chaos, the second Renyi entropy 
($S^{(2)}$) coincides with the entropy of the corresponding thermal ensemble
describing the subsystem provided the size of the latter is less than half 
the size of the total system. This checks against numerical simulations 
on a number of fermionic and bosonic lattices [26]. Similar results are 
obtained for the entanglement entropy when the subsystem size is small 
and the effective temperature describing the thermal ensemble for the 
subsystem is high. However, in the thermodynamic limit, the entangle- 
ment entropy is expected to coincide with the thermodynamic entropy, 
when the subsystem size is less than half of the size of the composite 
system, as indicated by arguments based on ETH (see sec. 10.3 for the 


basics of the eigenstate thermalization hypothesis). 
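

The definitions (10-21a)-(10-21c) translate directly into a few lines of linear algebra. The following sketch is an illustration with $k_B = 1$; the function names, the dimensions of A and B, and the use of a random pure state as a stand-in for a highly delocalized ('chaotic') state are all assumptions made here. It computes the reduced density matrix by a partial trace and evaluates both entropies, with a product state included as a reference for which both vanish.

```python
import numpy as np

def reduced_density_matrix(psi, dim_a, dim_b):
    """Partial trace over B of |psi><psi|, with psi indexed as (m, a) -> m*dim_b + a."""
    c = psi.reshape(dim_a, dim_b)          # c[m, a] = <m a|psi>
    return c @ c.conj().T                  # (rho_A)_{mn} = sum_a c[m,a] c*[n,a]

def von_neumann_entropy(rho):
    """-Tr(rho ln rho), in units of k_B."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]                       # discard numerically zero eigenvalues
    return float(-np.sum(w * np.log(w)))

def renyi2_entropy(rho):
    """-ln Tr(rho^2)."""
    return float(-np.log(np.real(np.trace(rho @ rho))))

rng = np.random.default_rng(4)
dim_a, dim_b = 8, 128                      # subsystem A much smaller than B

# Product state: no entanglement, both entropies vanish.
prod = np.kron(np.eye(dim_a)[0], np.eye(dim_b)[0]).astype(complex)
rho_prod = reduced_density_matrix(prod, dim_a, dim_b)

# Random pure state of the composite system (a stand-in for a 'chaotic' state).
psi = rng.standard_normal(dim_a * dim_b) + 1j * rng.standard_normal(dim_a * dim_b)
psi /= np.linalg.norm(psi)
rho_a = reduced_density_matrix(psi, dim_a, dim_b)

s_vn, s_2 = von_neumann_entropy(rho_a), renyi2_entropy(rho_a)
print(von_neumann_entropy(rho_prod), renyi2_entropy(rho_prod), s_vn, s_2, np.log(dim_a))
```

For the random state both entropies come out close to the maximal value $\ln(\dim A)$, with the second Renyi entropy never exceeding the von Neumann entropy.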


Another entropy-based approach to characterize delocalization in quan- 
tum chaos is to refer to the so-called diagonal ensemble. Once again, 
the classical context sets the background, providing insight and motiva- 
tion. Recall, from sec. 9.2.2 that the stationary behavior of a system at 
a large time t can be described in terms of the time-averaged probability 


distribution. We consider, for a finite $t$,


$$\bar\rho_t(X) = \frac{1}{t}\int_0^t d\tau\, \rho_\tau(X), \tag{10-22}$$

where $\rho_\tau(X)$ is the probability distribution in the phase space that results
by the Liouville evolution in time $\tau$ from the initial distribution $\rho_0(X)$.
By making use of the convexity of the logarithm function (taken with a 
negative sign), one can see that the entropy of this averaged distribution 
is a non-decreasing function of time. In the case of an ergodic system, 
the entropy approaches the microcanonical value at large times, while, in
the case of a non-ergodic system, the microcanonical value may not be 


arrived at. 


In the quantum context, one refers to the time-averaged density matrix.
The elements of the density matrix in the energy basis oscillate
quasi-periodically in time as


$$\hat\rho(t)_{mn} = \hat\rho(0)_{mn}\, e^{-\frac{i}{\hbar}(E_m - E_n)t}, \tag{10-23a}$$


where the notation is familiar by now. Assuming that the energy eigenvalues
satisfy the non-degeneracy condition, one concludes that the time-averaged
density matrix eventually settles down to the diagonal ensemble,
in which only the diagonal elements of the initial density matrix survive,


$$\hat\rho_D \equiv \lim_{t\to\infty}\frac{1}{t}\int_0^t d\tau\, \hat\rho(\tau) = \sum_n p_n\, |n\rangle\langle n| \qquad \left(p_n \equiv \hat\rho(0)_{nn}\right). \tag{10-23b}$$


Since the diagonal ensemble is a stationary one ($\hat\rho_D$ commutes with the
Hamiltonian), one concludes that dephasing implied by the non-degeneracy 
condition (itself a consequence of quantum chaos) effectively brings about 


a stationary state that need not be the equilibrium state described by the 
microcanonical ensemble, since the diagonal ensemble carries the imprint
of the initial state. Indeed, the relaxation to the equilibrium state
requires additional conditions summarized by the eigenstate thermaliza- 


tion hypothesis (ETH) to be introduced in sec. 10.3 below. 
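

The dephasing that produces the diagonal ensemble can be followed explicitly for a small system. The sketch below is an illustration with $\hbar = 1$; the random symmetric matrix used as 'Hamiltonian', the dimension, and the function name `time_averaged_rho` are assumptions made here. It evaluates the time average of the density matrix over $[0, t]$ analytically, element by element, and measures its distance from the diagonal ensemble for increasing $t$.

```python
import numpy as np

rng = np.random.default_rng(5)
D = 6
m = rng.standard_normal((D, D))
energies, vecs = np.linalg.eigh((m + m.T) / 2.0)   # generic, non-degenerate spectrum

psi0 = rng.standard_normal(D) + 1j * rng.standard_normal(D)
psi0 /= np.linalg.norm(psi0)
c = vecs.conj().T @ psi0                 # amplitudes C_n = <n|psi_0>
rho0 = np.outer(c, c.conj())             # initial density matrix in the energy basis

def time_averaged_rho(rho_init, energies, t):
    """(1/t) * integral_0^t rho(tau) dtau, element by element (hbar = 1)."""
    w = energies[:, None] - energies[None, :]          # E_m - E_n
    # Average of exp(-i w tau) over [0, t] equals exp(-i w t/2) * sin(w t/2)/(w t/2);
    # np.sinc(x) = sin(pi x)/(pi x), hence the division by pi.
    phase_avg = np.exp(-0.5j * w * t) * np.sinc(w * t / (2.0 * np.pi))
    return rho_init * phase_avg

rho_diag = np.diag(np.diag(rho0))        # diagonal ensemble, as in (10-23b)
errors = {}
for t in (1.0, 1e2, 1e4):
    errors[t] = np.abs(time_averaged_rho(rho0, energies, t) - rho_diag).max()
    print(t, errors[t])

p = np.real(np.diag(rho0))
s_diag = float(-np.sum(p * np.log(p)))   # entropy of the diagonal ensemble (k_B = 1)
print(s_diag, np.log(D))
```

The off-diagonal elements of the averaged matrix decay as $1/(|E_m - E_n|\,t)$, so the approach to the diagonal ensemble is slowest for the closest pair of levels, while the populations $p_n$ retain the imprint of the initial state.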


The idea of dephasing and the dephased state, however, does not 
require the non-degeneracy condition as an essential pre-requisite; 


refer to sec. 10.4.1 


The delocalization in the energy space resulting in the diagonal ensemble 
for systems with quantum chaos can be looked upon as the counterpart 
of the delocalization in the phase space of a classical system as seen from 
the large time behavior of the time-averaged probability density. This 
analogy receives justification from numerical simulations of systems that 
possess classical counterparts where one finds that the resemblance be- 
tween classical and quantum systems persists even in deep quantum 
regimes [26]. The process of delocalization can be followed in time in the 
quench dynamics of a number of fermionic and bosonic lattice systems, 
in each of which the system is initially prepared in some eigenstate of an 
initial Hamiltonian and a quench is applied at some time t = 0 (say) after
which the system dynamics is determined by a different final Hamiltonian.
For a sufficiently large chaotic system, the probability distribution
among the energy eigenstates of the final Hamiltonian resembles a
Gaussian distribution. In the case of an integrable system, the diagonal 


ensemble looks distinctly different. 


Incidentally, the von Neumann entropy of the diagonal ensemble coincides
with the diagonal entropy (with reference to the Hamiltonian determining
the system dynamics), and is given by (refer back to (10-23b))


$$S_D = -k_B \sum_n p_n \ln p_n. \tag{10-24}$$


With reference to a quench experiment where the initial state (assumed
to be a pure one) $|\psi\rangle$ is allowed to evolve in accordance with a Hamiltonian
H, the diagonal entropy is nothing but the information entropy of
$|\psi\rangle$ in the basis provided by H. In the case of a chaotic system satisfying
the eigenstate thermalization hypothesis, the dephasing involves a re- 
laxation to the microcanonical ensemble (assuming that the initial state 
is a superposition of eigenstates of H all lying within a narrow energy 
window) so that the diagonal ensemble then closely approximates the mi- 
crocanonical ensemble (refer, once again, to sec. 10.3 below). In the case 
of an integrable system, the diagonal entropy differs considerably from 
the microcanonical entropy, implying the absence of delocalization in the 


energy space. 


10.2.9 Quantum chaos: a brief overview 


While the microscopic dynamics of a system determines its macroscopic 
behavior, and while the business of statistical mechanics is to establish 
the connection between the former and the latter, statistical mechanics 
does not deal separately with each individual system, but rather tries to 
find the common feature of the microscopic dynamics of systems that 
can explain their macroscopic thermodynamic behavior. In the classi- 


cal theory the feature of ergodicity was identified by Boltzmann to be 
a sufficient dynamical characteristic on the basis of which the program
of equilibrium statistical mechanics could be launched. Non-equilibrium 
behavior, on the other hand, requires stronger assumptions on the micro- 
scopic dynamics if one is to explain the macroscopic features of systems 
in the far-from-equilibrium regime, especially the feature of irreversibility 
expressed by the positive definiteness of the rate of entropy production.
Analogous to the ergodic hypothesis, the chaotic hypothesis (which includes
the assumption of mixing dynamics), has been adopted as a sufficient 
assumption in this case. Remarkably, however, the transient fluctuation 
relations (sec. 9.14) do not require explicit assumptions regarding the 
chaotic nature of the underlying dynamics (requiring, instead, plausi- 
ble assumptions on the initial ensembles). In all these cases, the broader 
assumptions of ergodicity and chaos have not been established to be nec- 
essary ones, though it is generally accepted that the chaotic dynamical 
features somehow assume relevance for systems made up of enormously 


large numbers of interacting particles.


In the quantum theory, chaos is interpreted on a different basis — one that 
has nevertheless a correspondence with the classical concept of chaos in 
the semi-classical limit. The most general criterion of chaos in the quan- 
tum description is in terms of the random matrix theory. Though a given 
many-particle quantum system is described in terms of a single Hamil- 
tonian rather than an ensemble of matrices, the former is said to satisfy 
the criterion of quantum chaos if the energy eigenvalues and eigenvectors 
have statistical features resembling those of random matrix ensembles, 


especially those relating to the matrix elements of observables of the system
(refer back to sec. 10.2.7). This feature of resemblance to random
matrix ensembles is often referred to as the criterion of quantum chaos 


and ‘quantum ergodicity’. 


However, quantum chaos in this sense is not sufficient to explain equilib- 
rium and non-equilibrium macroscopic behavior of a system. It is found 
that a certain additional structure is necessary for this, which is pro- 
vided by the eigenstate thermalization hypothesis (ETH), as outlined in 
the following sections. Once again, as in the case of the ergodic hypoth- 
esis and the chaotic hypothesis in the classical theory, the ETH is in the 
nature of a sufficient assumption and, given a many-particle Hamilto- 
nian, there is no route leading independently to (equilibrium and non- 
equilibrium) thermodynamic behavior without referring to the statistical 
features of the matrix elements of (some appropriate class of) observables 
as postulated in the ETH. Once again, one tacitly adopts the position that 
a many-particle interacting system presumably satisfies the ETH in the 


thermodynamic limit. 


10.3 The eigenstate thermalization hypothesis (ETH)


Let us consider a closed many-particle interacting system with Hamiltonian
$H$, energy eigenvalues $E_n$ and corresponding eigenstates $|n\rangle$ ($n = 1, 2, \cdots$),
prepared in an initial ($t = 0$) pure state $|\psi_0\rangle$ given by


$$|\psi_0\rangle = \sum_n C_n |n\rangle \qquad \Big(\sum_n |C_n|^2 = 1\Big). \tag{10-25a}$$


The state of the system at time $t$ is given by


$$|\psi(t)\rangle = \sum_n C_n\, e^{-\frac{i}{\hbar}E_n t}\, |n\rangle. \tag{10-25b}$$


Given an observable $A$, its expectation value at time $t$ is


$$A(t) \equiv \langle\psi(t)|A|\psi(t)\rangle = \sum_n |C_n|^2 A_{nn} + \sum_{m \neq n} C_m^* C_n \exp\left[\frac{i}{\hbar}(E_m - E_n)t\right] A_{mn}, \tag{10-26a}$$


where we have separated the diagonal contribution from terms containing
the off-diagonal matrix elements $A_{mn}$. The long time limit of the above
expression is well defined (in the sense of an average) provided the spectrum
of eigenvalues ($\{E_n\}$) is non-degenerate (or the degeneracies, if any,
are sub-extensive), where the contribution of the off-diagonal terms goes
to zero and one obtains


$$A(t) \to \sum_n |C_n|^2 A_{nn}. \tag{10-26b}$$


On invoking the eigenvector ensemble average and making use of the RMT
result given in the first line of (10-19a), one finds that, within RMT, $A(t)$
goes to $\bar{A}$ (first equality in (10-20b)), regardless of the initial state $|\psi_0\rangle$, i.e.,
of the initial amplitudes $C_n$ ($n = 1, 2, \cdots$) (the off-diagonal contributions
are in any case small in virtue of the RMT result stated in the second line
of (10-19a), where these are seen to be exponentially small in the system
size). This looks like an evolution towards an equilibrium state, but is
actually too strong a result to be meaningful.
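

The insensitivity of the RMT result to the initial state can be illustrated numerically. In the sketch below, which is an illustration only, a single GOE-like matrix plays the role of the Hamiltonian, the observable is diagonal in the fixed basis with eigenvalues spread over $[-1, 1]$ (so that $\bar{A} = 0$), and the initial state is a basis vector, i.e., an eigenstate of $A$; these choices and the name `long_time_average` are assumptions made here. The long-time average is evaluated from the diagonal sum in (10-26b).

```python
import numpy as np

rng = np.random.default_rng(6)
D = 300
m = rng.standard_normal((D, D))
energies, vecs = np.linalg.eigh((m + m.T) / np.sqrt(2.0))   # GOE-like 'Hamiltonian'

a = np.linspace(-1.0, 1.0, D)            # observable A = diag(a) in the fixed basis
a_bar = a.mean()                         # RMT prediction for every diagonal element

# Diagonal elements A_nn = <n|A|n> in the energy eigenbasis.
a_nn = np.einsum("kn,k,kn->n", vecs, a, vecs)

def long_time_average(k):
    """sum_n |C_n|^2 A_nn for the initial state |psi_0> = k-th basis vector."""
    weights = vecs[k, :] ** 2            # |C_n|^2 = |<n|k>|^2
    return float(np.sum(weights * a_nn))

for k in (0, D // 2, D - 1):
    print(f"initial <A> = {a[k]:+.3f}, long-time average = {long_time_average(k):+.4f}, "
          f"A_bar = {a_bar:+.4f}")
```

Initial states with expectation values as different as $-1$ and $+1$ relax to the same value $\bar{A}$, up to corrections of order $1/D$, which is the infinite-temperature behavior discussed next.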


As we know, for an equilibrium state described by a microcanonical en- 
semble, the diagonal matrix elements of an observable are all equal only 
for energy eigenstates within a narrow window (the off-diagonal matrix 
elements within this window are zero), while the RMT result implies that 
all the diagonal matrix elements are equal throughout the entire energy 
spectrum which effectively means that the system attains an infinite tem- 
perature equilibrium state. For real life systems, the long time average 
of an observable equals the microcanonical average that depends on the 
energy of the system, and the relaxation time also depends on the ob- 
servable under consideration (in contrast to the RMT result that the re- 
laxation time is essentially observable-independent). This requires addi- 
tional structure in the matrix elements of observables to be imposed on 


the RMT results and an appropriate interpretation of these results. 


The formulation of a definitive condition in this regard was made pos- 
sible by Srednicki [130], [131] (and other workers in the area; see, in 
particular, [27]) in the form of an ansatz, the eigenstate thermalization 
hypothesis (ETH), for which Berry’s conjecture (sec. 10.2.4) acted as the 
guiding principle. It may be mentioned that von Neumann’s early work
on quantum ergodicity (see [58]) provided the basis for the important
developments in the nineteen nineties on the foundations of quantum


statistical mechanics. 


Continuing to consider a closed quantum system made up of a large number
of interacting particles, we assume that the initial state of the system
is a superposition of energy eigenstates with energies lying within a narrow
window (say, $\delta E$) around some value $E$. Then the long time average
of the expectation value of any chosen observable $A$ will be given by (refer
to (10-26a))


$$\overline{A(t)} \equiv \lim_{t\to\infty}\frac{1}{t}\int_0^t d\tau\, \langle\psi(\tau)|A|\psi(\tau)\rangle = \sum_{n\in[\delta E]} |C_n|^2 A_{nn}, \tag{10-27}$$

in which the contribution of the off-diagonal terms goes to zero because of
dephasing (where the long time average of $\exp\left[\frac{i}{\hbar}(E_m - E_n)t\right]$ vanishes for
$E_n \neq E_m$) and also of the RMT result on the off-diagonal matrix elements
of observables. What is notable in (10-27) is that now the summation on
the right hand side is confined to states within the narrow energy window
$\delta E$ around $E$ (which we refer to as the ‘(microcanonical) energy shell’). We
now assume, as a generalization of the RMT result, that all eigenstates
$|n\rangle$ within the energy shell look alike in a statistical sense but that the
statistics of the eigenstates depend smoothly on the energy, so that the
diagonal matrix elements $A_{nn}$ depend on the energy ($E$) of the shell to
which the initial state $|\psi_0\rangle$ belongs, while they are all the same for all
states within the shell. Since the initial state is a superposition of energy
eigenstates within the shell, we have


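$$\overline{A(t)} = \overline{A_{nn}} \sum_{n\in[\delta E]} |C_n|^2 = \frac{1}{\mathcal N}\sum_{n\in[\delta E]} A_{nn} \equiv A(E)\ \ \text{(say)}, \tag{10-28}$$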


where now the time average turns out to be a smooth function of the energy
($E$) characterizing the microcanonical shell under consideration. In
the above formula, $\mathcal N$ stands for the number of energy levels (almost all of
which are assumed to be non-degenerate) within the energy shell. Thus,
under the assumption of the smooth energy dependence of the diagonal
matrix elements $A_{nn}$, the time average of $A(t)$ equals the microcanonical
average.


1. If the energy dependence of $A_{nn}$ were to be ignored, then the above
time average would be given by $\bar{A} = \frac{1}{D}\sum_\alpha a_\alpha$ (check this out).


2. In writing the first equality in (10-28), we have made use of the fact that
the diagonal matrix elements are all nearly equal within the energy
shell, and have replaced $A_{nn}$ in eq. (10-27) by the ensemble average.


Srednicki proposed the ETH essentially as a generalization of the RMT 
within a thin window (the energy ‘shell’) around any specified energy ($E$)
for a closed system, assuming a smooth energy dependence across the
various different windows. The two major assumptions underlying the
ETH are, first, a non-degeneracy condition and, secondly, an ansatz for


the matrix elements of any reasonable observable within a window. 


The condition that the energy levels are to be non-degenerate holds for a
system free of all unitary symmetries, while the following more restrictive 
condition is also expected to hold: if any two sums of equal numbers of 
energy levels are equal, i.e., if an equality of the following form holds for 


some positive integer n, 


$$E_{i_1} + E_{i_2} + \cdots + E_{i_n} = E_{\alpha_1} + E_{\alpha_2} + \cdots + E_{\alpha_n}, \tag{10-29}$$


then the indices $\{i_1, \cdots, i_n\}$ are a permutation of $\{\alpha_1, \cdots, \alpha_n\}$ (we assume
that the indices correspond to states with given energies), i.e., the
corresponding states on the two sides are the same. The non-degeneracy of


energy levels follows as a special case of this more restrictive requirement. 


Another consequence of (10-29) is the non-resonance condition which 
states that all energy differences are non-degenerate. The necessity 
of this condition was underlined by von Neumann in his work on 


the quantum ergodic theorem [58].
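

For a finite list of energy levels, the non-degeneracy and non-resonance conditions can be tested directly. The sketch below is an illustration; the function names and the tolerance `tol` are assumptions made here. It treats the non-resonance condition as the requirement that all gaps $E_m - E_n$ ($m > n$) be distinct, which corresponds to the case $n = 2$ of (10-29), and contrasts an equally spaced spectrum with the spectrum of a random symmetric matrix.

```python
import numpy as np

def is_nondegenerate(energies, tol=1e-9):
    """True if no two energy levels coincide (within tol)."""
    e = np.sort(np.asarray(energies, dtype=float))
    return bool(np.all(np.diff(e) > tol))

def is_nonresonant(energies, tol=1e-9):
    """True if all gaps E_m - E_n (m > n) are distinct (within tol).

    Equal gaps E_i - E_k = E_l - E_j would give E_i + E_j = E_k + E_l,
    i.e. a non-trivial solution of (10-29) with two levels on each side.
    """
    e = np.sort(np.asarray(energies, dtype=float))
    i, j = np.triu_indices(len(e), k=1)
    gaps = np.sort(e[j] - e[i])            # all gaps between ordered pairs
    return bool(np.all(np.diff(gaps) > tol))

harmonic = np.arange(6, dtype=float)       # equally spaced levels: many equal gaps
rng = np.random.default_rng(7)
m = rng.standard_normal((8, 8))
generic = np.linalg.eigvalsh((m + m.T) / 2.0)   # spectrum of a random symmetric matrix

print(is_nondegenerate(harmonic), is_nonresonant(harmonic))
print(is_nondegenerate(generic), is_nonresonant(generic))
```

The equally spaced spectrum is non-degenerate but resonant, whereas a generic spectrum satisfies both conditions; a numerical check of this kind is, of course, meaningful only up to the chosen tolerance.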


The second crucial assumption underlying the ETH is the following ansatz 
for the matrix elements of an observable between the energy eigenstates 


of the system under consideration: 


$$\langle m|A|n\rangle = A(E)\,\delta_{mn} + e^{-\frac{S(E)}{2k_B}}\, f_A(E, \omega)\, R_{mn}. \tag{10-30a}$$

In this expression, $E$, $\omega$ are defined as

$$E \equiv \frac{E_m + E_n}{2}, \qquad \omega \equiv E_m - E_n, \tag{10-30b}$$

$A(E)$ and $f_A(E, \omega)$ are smooth functions of their arguments, $R_{mn}$ stands
for a factor (in general complex) whose real and imaginary parts are random
variables with zero mean and unit variance, and $S(E)$ represents the
entropy at energy $E$. Confining our attention to an energy shell of width
$\delta E$ around $E$, containing $\mathcal N$ energy levels, the entropy $S(E)$ is
given by $S(E) = k_B \ln \mathcal N$. This means that, in the RMT formula (10-20a),
we have to take $D = \mathcal N = e^{S(E)/k_B}$ (recall that the ETH is a generalization of
the RMT ansatz adopted separately for each of the narrow energy shells),
which explains the factor $e^{-S(E)/2k_B}$ in (10-30a). Finally, it is the factor $f_A$ that
determines the decay of correlations and the non-equilibrium behavior of
the system under consideration as it moves towards the equilibrium state
described by the microcanonical distribution, after being initiated in an
arbitrarily chosen initial state within the energy shell.
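

The origin of the exponential factor can be made explicit by substituting the dimension of the shell into the RMT formula. The following short derivation is a sketch; it assumes that $\overline{A^2}$ of (10-20b) is evaluated within the shell and that its square root is what gets promoted to the smooth function $f_A(E, \omega)$.

```latex
\begin{aligned}
A_{mn} &= \bar{A}\,\delta_{mn} + \sqrt{\frac{\overline{A^2}}{D}}\;R_{mn}
   && \text{(RMT, (10-20a))} \\
D \;\to\; \mathcal{N} = e^{S(E)/k_B}
   \;&\Longrightarrow\; \frac{1}{\sqrt{D}} \;\to\; e^{-S(E)/2k_B}, \\
\langle m|A|n\rangle &= A(E)\,\delta_{mn}
   + e^{-S(E)/2k_B}\,\underbrace{\sqrt{\overline{A^2}}}_{\to\; f_A(E,\omega)}\,R_{mn}
   && \text{(ETH, (10-30a))}.
\end{aligned}
```

Since $S(E)$ is extensive, the off-diagonal elements are exponentially small in the system size, while the dependence of $f_A$ on $\omega$ carries the dynamical information that is absent in the RMT form.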


Here and in the following, we generally set $\hbar$ to unity, and introduce
it in formulas when explicitly needed.


In (10-30a), $f_A$ can be chosen to be real, positive, and an even function
of $\omega$ without loss of generality [131]. As for the coefficients $R_{mn}$, these
satisfy, in general, $R_{nm} = R_{mn}^*$ while, if time reversal invariance holds,
these are real and satisfy $R_{nm} = R_{mn}$.


The ETH ansatz reduces to the RMT result (10-20a) if the width of the
energy shell ($\delta E$) happens to be less than the so-called Thouless energy $E_T$
of the system. The latter is related to the characteristic time $\tau$ ($E_T \sim \frac{\hbar}{\tau}$) for
the system to respond to changes in the boundary condition imposed on
it. For a diffusive system characterized by the diffusion constant $\bar{D}$ (the
overbar is added in order to distinguish from the matrix dimension $D$),
one has $E_T \sim \frac{\hbar\bar{D}}{L^2}$, where $L$ stands for its characteristic linear dimension.
Thus, in the thermodynamic limit, the ETH ansatz remains meaningful
since one can choose a sufficiently narrow energy window without the
function $f_A$ becoming essentially independent of $\omega$ as in the RMT.


10.3.1 ETH and thermalization 


We summarize in this section a few results from [26], [131], [27]. A few gen- 
eral aspects of equilibration and thermalization in interacting many-body 


systems will be briefly reviewed in sec. 10.4. 


We have already seen that, given an initial state $|\psi_0\rangle$ in a superposition of
energy eigenstates $|n\rangle$ with energies within a narrow window around some
energy $E$, the ETH ensures that the long time average of an observable $\hat A$
closely approximates the microcanonical value (refer back to (10-28)),

$$\langle A\rangle_t = \lim_{T\to\infty}\frac{1}{T}\int_0^T dt\,\langle\psi(t)|\hat A|\psi(t)\rangle \approx A(E),$$ (10-31)

in establishing which we just had to assume a smooth energy dependence
of the RMT value of the diagonal elements $A_{nn}$ (first term on the right hand
side of (10-20a)), the latter being part of the ETH ansatz (10-30a). It may 
be easily verified that the same conclusion holds if the initial state is an 
incoherent mixture of pure states with energies confined to the narrow 
energy window, because all that is required in this regard is the long 
term decoherence of the various different energy eigenstates (in virtue of 
the non-degeneracy assumption and of the exponentially small values of 


the off-diagonal matrix elements). 
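The role of dephasing in this argument can be checked on a small toy model. The sketch below is an illustration under assumptions that are not from the text: eight non-degenerate levels $E_k = \sqrt{k+1}$, a random normalized state, a random real symmetric observable, and $\hbar = 1$. The names `finite_time_average` and `diagonal_ensemble` are hypothetical. The average over $0 \le t \le T$ is evaluated in closed form, so that the decay of the off-diagonal contributions as $1/T$ is explicit.

```python
import cmath
import math
import random

def finite_time_average(energies, coeffs, obs, T):
    """Exact average over 0 <= t <= T of <psi(t)|A|psi(t)>, with hbar = 1."""
    n = len(energies)
    total = 0j
    for m in range(n):
        for k in range(n):
            amp = coeffs[m].conjugate() * obs[m][k] * coeffs[k]
            delta = energies[m] - energies[k]
            if m == k:
                total += amp   # time-independent diagonal contribution
            else:
                # (1/T) * integral_0^T exp(i*delta*t) dt, which decays as 1/T
                total += amp * (cmath.exp(1j * delta * T) - 1) / (1j * delta * T)
    return total.real

def diagonal_ensemble(coeffs, obs):
    """Sum over n of |C_n|^2 A_nn, i.e. the diagonal-ensemble value."""
    return sum(abs(c) ** 2 * obs[n][n] for n, c in enumerate(coeffs))

rng = random.Random(1)
n_levels = 8
energies = [math.sqrt(k + 1) for k in range(n_levels)]   # non-degenerate toy levels
raw = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_levels)]
norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
coeffs = [c / norm for c in raw]
# real symmetric toy observable with entries in [-1, 1]
obs = [[0.0] * n_levels for _ in range(n_levels)]
for m in range(n_levels):
    for k in range(m, n_levels):
        obs[m][k] = obs[k][m] = rng.uniform(-1.0, 1.0)

for T in (1e1, 1e3, 1e6):
    print(T, finite_time_average(energies, coeffs, obs, T), diagonal_ensemble(coeffs, obs))
```

Only the non-degeneracy of the levels is used here; the further step from the diagonal ensemble to the microcanonical value is where the smoothness of $A(E)$ enters.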
This property of the long time averages of the expectation values of
observables going over to the respective microcanonical averages will be
referred to as thermalization. We observe that, on the basis of RMT alone,
the phenomenon of decoherence among the energy eigenstates contributing
to the relevant expectation values and the equivalence of all the diagonal
contributions, imply

$$\langle A\rangle_t \approx \mathrm{Tr}(\hat\rho_D \hat A),$$ (10-32a)

where $\hat\rho_D$ is the diagonal ensemble defined by the initial state $\sum_n C_n|n\rangle$,
and where both sides of the above formula are approximated by $A(E)$. The
assumption of ETH along with the fact that the initial state is contained
within a narrow energy window then implies

$$\mathrm{Tr}(\hat\rho_D \hat A) \approx \mathrm{Tr}(\hat\rho_m \hat A),$$ (10-32b)

where $\hat\rho_m$ stands for the microcanonical ensemble corresponding to the
chosen energy window,

$$\hat\rho_m = \frac{1}{\mathcal{N}}\sum_n |n\rangle\langle n|,$$ (10-32c)

(check this out; refer to the first line of (10-28)).
With the initial state $|\psi_0\rangle$ chosen as mentioned above (a superposition of
energy eigenstates contained within the energy window $\delta E$), the variance
in energy in the diagonal ensemble as well as in the microcanonical
ensemble is of the order of $(\delta E)^2$, which also sets the order of magnitude of
the variance of an observable $\hat A$ in the two ensembles. Finally, the ETH
ansatz also leads to the result that $\langle A\rangle_t$ equals the canonical thermal
average ($\langle A\rangle_T$) of $\hat A$ to within $O((\delta E)^2)$ (along with a correction term of the
order of the inverse of the system size), where one need not go through
the usual demonstration of the equivalence between the microcanonical
and canonical ensembles (see [131]).


It also turns out that the temporal fluctuations of $A_t = \langle\psi(t)|\hat A|\psi(t)\rangle$, given
by

$$\langle (A_t - \langle A\rangle_t)^2\rangle_t = \lim_{T\to\infty}\frac{1}{T}\int_0^T dt\,(A_t - \langle A\rangle_t)^2,$$ (10-33a)

are exponentially small in the system size,

$$\Delta A_t = \langle (A_t - \langle A\rangle_t)^2\rangle_t = O(e^{-S}),$$ (10-33b)

which means that, regardless of its initial value, an observable $\hat A$
eventually approaches its equilibrium value $\langle A\rangle_t \approx A(E)$, and then stays close
to it most of the time. This is a characteristic feature of thermal
equilibrium. The expression (10-33b), however, does not represent the thermal
fluctuations of $\hat A$. It turns out that the infinite time average of the mean
squared quantum fluctuations of $\hat A$ is the same as the thermal
fluctuations to within a term $O(N^{-1})$,

$$\big\langle \langle\psi(t)|(\hat A - \langle A\rangle_t)^2|\psi(t)\rangle \big\rangle_t = \langle A^2\rangle_T - \langle A\rangle_T^2 + O((\delta E)^2) + O(N^{-1}).$$ (10-34)

Since $\langle A^2\rangle_T - \langle A\rangle_T^2$ itself is of the order of $N^{-1}$, the above identification
holds to within a numerical factor of the order of unity.


In summary, the ETH explains the phenomenon of thermalization of iso- 
lated quantum mechanical systems in a manner distinct from the way 
thermalization is explained in classical mechanics. In the latter case, 
one speaks of collisions and energy distribution among the different de- 
grees of freedom, while in the former, the ‘explanation’ rests on the phe- 
nomenon of dephasing since the features of thermalization are, in a man- 
ner of speaking, hidden in the structure of the individual eigenstates, 
which get eventually revealed in virtue of the time evolution. However, 
this is nothing but two distinct ways of describing the same process in 
differing contexts since, in the quantum description, the eigenfunctions 
themselves carry all the information present in the Hamiltonian. Instead 
of comparing pure states represented by points in the phase space with 
the eigenstates of the Hamiltonian, a more meaningful comparison would 
be between eigenstates in the Hilbert space and meandering trajectories
in the phase space originating in distinct initial points, where all these 
trajectories look alike in a statistical sense in virtue of ergodicity. It is 


this similarity that is revealed in Berry’s conjecture. 


10.3.2 The quantum ergodic theorem 


The RMT and ETH can be looked upon as developments related to the
earlier quantum ergodic theorem of von Neumann. von Neumann's work
contains the germ of what is now referred to as pure state quantum statistical
mechanics since he focused on the time-evolution of expectation values
of observables in pure states that develop according to the Schrödinger
equation, and established the conditions under which such evolution
leads to results equivalent to those implied by an equilibrium ensemble.
His analysis principally made use of the large size of systems of inter- 
est in statistical mechanics and, correspondingly, the enormously large 
dimensions of the Hilbert spaces made up of the pure states of these 
systems. The theory that he initiated made no specific reference to the 
chaotic features of the underlying dynamics and was, accordingly, neu- 
tral as regards the distinction between integrable and chaotic systems 
though he highlighted the relevance of the non-degeneracy conditions 
typical of chaotic systems. As we have mentioned in earlier sections, one 
way of making this distinction is by referring to the classical counter- 
parts of the systems under consideration. The other, more precise, way 
of making this distinction is to look at the statistical features of the energy 
eigenvalues and eigenvectors of the system, and of the matrix elements 
of observables in the energy basis, and to compare these with the predic- 
tions of RMT. The ETH goes further and assumes a certain structure in 
the diagonal and off-diagonal matrix elements of observables, as outlined 


in sec. 10.3. 


1. The ‘quantum ergodic theorem’ of von Neumann is similar in spirit to 
the classical ergodic theorem and to more recent work in the founda- 
tions of quantum statistical mechanics. von Neumann was an early 
exponent of the tradition of focusing on expectation values of observ- 
ables rather than on the evolution of states of a system — an approach 
that was continued in much of subsequent work on equilibration and 
thermalization (see sec. 10.4). Indeed, a state can be defined in terms 
of expectation values of a set of relevant observables. As in other
areas, von Neumann’s physical and mathematical insights have proved
far-reaching in the context of subsequent developments.


2. ‘Pure state quantum statistical mechanics’ considers the long term 
evolution of pure states of a system (and the associated evolution of 
expectation values of observables), though mixed states may also be 
admitted within the same framework. It makes use of standard prin- 
ciples of quantum mechanics and, at times, focuses on finite dimen- 
sional systems so as to bring out the essential features of the time 
evolution in relatively simple terms. A number of basic principles and 
concepts of quantum information theory are invoked in order to add 
precision to the theory. We will, however, gloss over most of the tech- 
nicalities in the present section and in sec. 10.4 below where the latter 
summarizes a number of basic results in equilibration and thermaliza- 
tion — processes of fundamental relevance in the statistical mechanics 


of interacting many-body systems. 


Considering a many-body system, we focus on a narrow energy window
(a ‘microcanonical energy shell’) containing energy levels $E_n$ ($n = 1,2,\ldots$),
and the span of all the corresponding eigenstates $|n\rangle$, referring to the
latter as the relevant Hilbert space $\mathcal{H}$ (a subspace of the overall Hilbert
space). The restriction of the Hamiltonian of the system to $\mathcal{H}$ will be taken
as the effective Hamiltonian ($\hat H = \sum_n E_n |n\rangle\langle n|$) in the context of the energy
shell under consideration. Even as we consider a window of small width,
the dimension ($D$) of the effective Hilbert space will be assumed to be
exponentially large in the system size ($N$), which itself is an enormously
large number.


We now assume that $\mathcal{H}$ is the direct sum of orthogonal subspaces $\mathcal{B}_\alpha$ of
respective dimensions $d_\alpha$ ($\alpha = 1,2,\ldots$) where each of the $d_\alpha$’s, while small
compared to $D$ ($= \sum_\alpha d_\alpha$), is an enormously large number in its own right.
Let $\hat P_\alpha$ be the projection operator on to $\mathcal{B}_\alpha$. A macroscopic observable $\hat A$ in
$\mathcal{H}$ can be expressed as a sum

$$\hat A = \sum_\alpha A_\alpha \hat P_\alpha,$$ (10-35a)

where the numbers $A_\alpha$ ($\alpha = 1,2,\ldots$) define the observable in question.


The orthogonal subspaces $\mathcal{B}_\alpha$ are assumed to correspond to ones
made up of common eigenvectors of a set of commuting operators
$\hat B^{(1)}, \hat B^{(2)}, \ldots, \hat B^{(k)}$ (say), where distinct subspaces $\mathcal{B}_\alpha$ are associated
with distinct sets of eigenvalues $\{b^{(1)}_\alpha, b^{(2)}_\alpha, \ldots, b^{(k)}_\alpha\}$. These operators
are obtained as approximations to nearly commuting operators
$\tilde B^{(1)}, \tilde B^{(2)}, \ldots, \tilde B^{(k)}$,
representing macroscopic observables whose principal characteristic
is that their eigenvalues are highly degenerate. Each of the subspaces
$\mathcal{B}_\alpha$ can then be assumed to correspond to a macroscopic state of the
system under consideration, represented in terms of (approximately
specified) values of the macroscopic observables.


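The expectation value of the observable $\hat A$ at time $t$ is given by

$$A(t) = \langle\psi|e^{i\hat H t}\,\hat A\, e^{-i\hat H t}|\psi\rangle,$$ (10-35b)

where $|\psi\rangle$ is the state in the Heisenberg picture (i.e., the initial state $|\psi_0\rangle$
in the Schrödinger picture), while its microcanonical average ($\langle A\rangle$) is

$$\langle A\rangle = \frac{1}{D}\sum_n \langle n|\hat A|n\rangle.$$ (10-35c)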


The central question to be addressed is then: in what sense can $A(t)$ be
said to be close to the microcanonical average? In view of (10-35a), this
can also be re-framed in terms of the projectors $\hat P_\alpha$ ($\alpha = 1,2,\ldots$) by asking
whether, for all $\alpha$,

$$\langle\psi(t)|\hat P_\alpha|\psi(t)\rangle \approx \frac{d_\alpha}{D},$$ (10-36)

holds for most times $t$.
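The step from (10-35a) to the projector form of the question can be written out explicitly. The short derivation below is a sketch that reads the microcanonical average as $\langle A\rangle = \frac{1}{D}\sum_n \langle n|\hat A|n\rangle$, with the trace taken over the energy shell, and uses $\mathrm{Tr}\,\hat P_\alpha = d_\alpha$:

```latex
\langle A\rangle = \frac{1}{D}\,\mathrm{Tr}\,\hat A
  = \frac{1}{D}\sum_\alpha A_\alpha\,\mathrm{Tr}\,\hat P_\alpha
  = \sum_\alpha A_\alpha\,\frac{d_\alpha}{D},
\qquad
A(t) - \langle A\rangle
  = \sum_\alpha A_\alpha\left(\langle\psi(t)|\hat P_\alpha|\psi(t)\rangle - \frac{d_\alpha}{D}\right).
```

Hence $A(t)$ stays close to $\langle A\rangle$ whenever each projector expectation value stays close to $d_\alpha/D$, which is why the question can be posed for the projectors alone.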


It is assumed that the energy spectrum within the shell under consideration
(it may be mentioned that the theorem to be stated is to apply to all
possible energy shells of the system) satisfies the non-resonance condition
that all energy differences (not necessarily the gaps between nearest
levels only) are to be non-degenerate (degeneracies, if any, are assumed
to be accidental). One further assumes that, for each of the subspaces
$\mathcal{B}_\alpha$, the following quantity

$$F_\alpha[\hat H, \mathcal{B}] = \max_{m\neq n}\big|\langle m|\hat P_\alpha|n\rangle\big|^2 + \max_{m}\Big(\langle m|\hat P_\alpha|m\rangle - \frac{d_\alpha}{D}\Big)^2$$ (10-37)

is exponentially small in the system size. Here $\mathcal{B}$ denotes the family of the
subspaces $\mathcal{B}_\alpha$ ($\alpha = 1,2,\ldots$).


Subject to these two conditions, one can derive that there exist $\eta$, $\epsilon$, both
sufficiently small, such that

$$|A(t) - \langle A\rangle|^2 < \epsilon\,\langle A^2\rangle,$$ (10-38)

for all but a small fraction ($\eta$) of any given interval $0 < t < T$.
Equivalently [26], (10-36) is satisfied most of the time.
A system defined by Hamiltonian $\hat H$, Hilbert space $\mathcal{H}$, and a decomposition of
$\mathcal{H}$ into an orthogonal family of subspaces $\mathcal{B}$ (all with reference to any
arbitrarily chosen microcanonical energy shell), is said to be ‘normal’ for a
state $|\psi\rangle$ if (10-38) (or (10-36)) holds for that $|\psi\rangle$. The quantum ergodic
theorem further states that a sufficiently large interacting system (i.e., a
‘macroscopic’ one) is normal for all $|\psi\rangle$ for most choices of the family $\mathcal{B}$.


The quantum ergodic theorem (QET) implies the following results for the 


matrix elements of a macroscopic observable A in the energy basis 


$$\langle m|\hat A|n\rangle = \sum_\alpha A_\alpha \langle m|\hat P_\alpha|n\rangle \approx 0 \quad (m \neq n),$$ (10-39)

the expression on the right hand side being exponentially small by the
QET condition on $F_\alpha$, defined in (10-37).


In other words, the QET is in accord with the RMT predictions as outlined
in sec. 10.2, the latter being equivalent to the ETH ansatz within
the Thouless energy gap where the envelope function $f_A(E,\omega)$ is
structureless. Without this restriction to the Thouless energy gap, the QET
condition on $F_\alpha$ cannot be satisfied. The QET is crucially dependent on
the fact that the overlaps between the energy eigenstates and the
eigenstates of macroscopic observables are exponentially small. The ETH, on
the other hand, goes beyond the QET and the RMT because it goes beyond
the structureless Thouless energy gap in exploring the time evolution of
the system under consideration.

10.3.3 Tests for ETH


The ETH has been verified numerically in a variety of interacting lattice 
models of diverse descriptions ranging from condensed matter to ultracold
quantum gases. Early verification came for a two-dimensional system of 
hard-core bosons, and similar corroboration of the hypothesis has sub- 
sequently been obtained for models of soft-core bosons, for interacting 
chains of fermions with and without spin, and for the transverse field 
Ising model in two dimensions. The ETH ansatz has been verified for 
the diagonal matrix elements of observables and, independently, for the 


off-diagonal matrix elements as well (see [26] and references therein). 


One of the models studied is the one with the Hamiltonian

$$\hat H = \sum_{j=1}^{L}\Big[ V\big(\hat n_j - \tfrac{1}{2}\big)\big(\hat n_{j+1} - \tfrac{1}{2}\big) + V'\big(\hat n_j - \tfrac{1}{2}\big)\big(\hat n_{j+2} - \tfrac{1}{2}\big) - J\big(\hat b_j^\dagger \hat b_{j+1} + \hat b_{j+1}^\dagger \hat b_j\big) - J'\big(\hat b_j^\dagger \hat b_{j+2} + \hat b_{j+2}^\dagger \hat b_j\big)\Big],$$ (10-40)

which describes a chain of hard-core bosons (analogous to the fermion
chain described by (10-5)), where $\hat b_j$, $\hat b_j^\dagger$ are bosonic annihilation and
creation operators at site $j$, $\hat n_j$ is the occupation number operator, and $L$ is
the number of lattice sites; periodic boundary conditions are assumed
($\hat b_1 = \hat b_{L+1}$, $\hat b_2 = \hat b_{L+2}$). Once again, non-zero values of $V'$, $J'$ imply the
breaking of integrability, while an increase in these parameters (a common
choice is $V' = J'$) leads to a predominance of quantum chaos. Diagonal
matrix elements in the energy basis are computed for a number of
operators [26], including the operator for the kinetic energy per site

$$\hat K = \frac{1}{L}\sum_{j=1}^{L}\Big[ -J\big(\hat b_j^\dagger \hat b_{j+1} + \hat b_{j+1}^\dagger \hat b_j\big) - J'\big(\hat b_j^\dagger \hat b_{j+2} + \hat b_{j+2}^\dagger \hat b_j\big)\Big].$$ (10-41)

It is found that, at integrability ($J' = V' = 0$), the diagonal matrix elements
in the energy basis (i.e., $K_{nn}$ for various eigenstates $|n\rangle$) fluctuate
erratically for nearby eigenstates, where the spread in these matrix elements
does not change appreciably with increasing system size. As one goes to
larger values of $V' = J'$ and, as the system size is made to increase at
some fixed value of $V' = J' \neq 0$, the fluctuations of the diagonal matrix
elements for energy values close to one another decrease sharply. These
observations are seen to persist for various different energy windows in
the spectrum, except at energy values close to the spectrum edges.
Similar results for a number of other systems provide ample support to the
statement that the first term on the right hand side of (10-30a) (with $A(E)$
representing the microcanonical average at energy $E$) correctly represents
the diagonal matrix elements of observables in the energy representation.


A similar conclusion is arrived at on the basis of fine-tuned computa- 
tions for off-diagonal matrix elements of observables in numerous chaotic 
models defined by parameters, of which special values correspond to in- 
tegrable systems. One observes that, in the chaotic case, the off-diagonal 
elements indeed vary smoothly with energy differences within any cho- 
sen microcanonical shell and are much smaller in magnitude compared 
to the diagonal elements in the same model and within the same shell, 


in conformity with the second term on the right hand side of (10-30a). 
In contrast, the off-diagonal matrix elements show a much greater variability
within any chosen energy shell in the integrability limit (a large
fraction of the off-diagonal elements turn out to be zero while some are 


seen to be of substantially large values). 


On computing, for a given observable $\hat A$, the function $f_A(E,\omega)$ appearing
in (10-30a) as a function of $\omega$ within any chosen energy window, one finds
three distinct ranges of $\omega$ with the following characteristics: (1) for large
$\omega$, $f_A$ decays exponentially, with little dependence on the system size, and
an excellent fit with the ETH ansatz is apparent; (2) for intermediate
values of $\omega$, $f_A$ scales with the system size ($L$, refer to (10-5), (10-40))
as $L^{1/2}$, and its plot is found to have a broad peak whose position on the
$\omega$-axis scales as $L^{-1}$; (3) for low $\omega$, the plot shows a plateau, where $f_A$
appears to scale as $L^{1/2}$, the plateau width being $\sim L^{-2}$.


The three regimes of variation of $f_A(E,\omega)$ corresponding to high,
intermediate, and low values of $\omega$ determine the nature of time-evolution of the
expectation value of the observable $\hat A$ at short, intermediate, and long
intervals of time. The short time regime corresponds to energy absorption
in many-body processes that are accounted for in higher order
perturbation theory. The intermediate regime corresponds to a ballistic time
evolution. Referring to either of (10-50c) and (10-50a), and making use
of the fluctuation-dissipation theorem, one obtains a slow diffusive time
evolution at large times. Significantly, at frequencies smaller than a
certain characteristic frequency $\omega_c \sim 1/\tau$, $f_A(E,\omega)$ is found to saturate to a
constant value $\sim L^{1/2}$. This corresponds to the situation where the ETH
and RMT predictions converge and the time evolution of the observable
reaches saturation.


Numerical simulations of the dynamics of observables after local and 
global quenches in strongly correlated lattice models have produced in- 
formation on relaxation time scales where there appears a broad similar- 
ity across systems of diverse descriptions. A time scale of relevance is 
the one in which a system approaches a configuration that is effectively 
equivalent to a diagonal ensemble. Additionally, the simulations enable 
one to address the question as to the characteristic system size for which 


the diagonal ensemble is equivalent to the microcanonical one. 


1. A local quench is one where system parameters pertaining to only a 
finite region of an extended system are changed, while a global quench 


involves a change in parameters pertaining to the entire system. 


2. In the case of quantum quenches involving local Hamiltonians, it turns
out that the width $\delta E$ of the energy distribution after the quench scales
as the square root of the system size ($\delta E \sim N^{1/2}$).


Typically, the system parameters are chosen in such a way that the ex- 
pectation value of the observable (A(t)) is relatively large in magnitude at 
the time of the quench, whereafter it is found to decrease rapidly and, 
finally, to oscillate around a small value. With increasing system size, 
the magnitude of A(t) about which the oscillations take place, is found to 
decrease along with the amplitude of oscillations. The numerical data lead
to the conclusion that, despite the exponentially small level spacings (in 
the system size) in a many-body system exhibiting quantum chaos, the 


time of relaxation to an effectively diagonal ensemble is not exponentially 
large. These data are also in agreement with the ETH result that the
variance of the time fluctuations of $A(t)$ about the equilibrium value is
exponentially small in the system size (refer back to (10-33b)). In other words,
even relatively small finite systems relax to a configuration to which the 
predictions of the diagonal ensemble apply, and then continue to remain 
close to it, this being commonly referred to as thermalization in the strong 


sense. 


It remains to inquire as to how far the predictions of the diagonal en- 
semble agree with those of statistical mechanics. It is here that the ETH 
has to play a specific role since it predicts that the observables of in- 
terest eventually relax in accordance with the requirements of statistical 
mechanics — ones corresponding to the microcanonical ensemble for the 


system as a whole and the canonical ensemble for a subsystem. 


In numerical simulations on fermion chains such as the one in (10-5) and 
boson chains as in (10-40), one computes $\Delta A$, the difference between the
diagonal ensemble expectation value and the microcanonical value as a
function of time, for appropriate operators $\hat A$ and for various different
values of the parameters characterizing the system. One observes that $\Delta A$
remains rather large close to the integrability limit and then decreases
rather sharply for parameter values in the quantum chaos regime. On
moving further into the chaotic regime, one finds that $\Delta A$ remains more
or less independent of the parameter values, and decreases with an in- 
crease in the system size, in conformity to the ETH predictions. Sig- 
nificantly, the absence of thermalization in the case of a quench to an 


integrable limit is found to be quite ubiquitous. 
In summary, numerical simulations indicate that the ETH predictions on
thermalization are complied with by large classes of systems character- 
ized by quantum chaos, while integrable and near-integrable systems 


deviate substantially from these predictions. 


Based on evidence provided by a large body of numerical experiments, 
the ETH is now widely accepted as the theoretical framework for the ex- 
planation of thermalization in isolated quantum mechanical systems. It 
appears that quantum chaos is a central requirement for the validity of 
the hypothesis, and the fact that thermalization is ubiquitous in nature is 
indicative of the ubiquity of quantum chaos in thermodynamic systems. 
However, disordered systems characterized by many-body localization in 
real space do not conform to the ETH, which involves delocalization in 
the energy space. Such systems, however, may exhibit a transition be- 
tween ‘non-ergodic’ and ‘ergodic’ phases across the so-called ‘mobility 
edge’, with the ETH being valid for the ergodic phase [116], associated 


with diffusive motion in real space. 


To date, there is no rigorous proof of the validity of the ETH, though there 


are ample theoretical indications as to why ETH should hold. 


10.3.4 ETH: the time course of relaxation 


As an isolated system is released from an initial non-equilibrium state, 
say, by the application of a quench, it follows a course of relaxation to 
the thermal equilibrium, the general nature of which can be worked out 


by invoking the ETH. 
Evidently, the details of the relaxation process will depend on the initial
state and the Hamiltonian of the system concerned, and one can describe 
the course of relaxation only in probabilistic terms. Here one can start 
from some appropriately chosen ensemble of initial states, in which case 
the relevant probability determining the course of relaxation will depend 
on which particular initial state is drawn from the ensemble. On the other 
hand, one can determine the state at some time $t$ ($> 0$) by starting from
a given initial ($t = 0$) state and take the time $t$ as a uniformly distributed
random variable, so that one can now talk of the probability of a state
at time $t$, induced by the probability distribution of $t$. As usual,
the macroscopic state under consideration will be identified in terms of 


expectation values of observables. 


Referring to an observable $\hat A$ and an initial state $|\psi_0\rangle$ (determining $A_0 =
\langle\psi_0|\hat A|\psi_0\rangle$), we will be interested in the probability distribution for $A_t =
\langle\psi_t|\hat A|\psi_t\rangle$ corresponding to any chosen value of $t$ from the probability
distribution (assumed to be uniform) over time. The relevant mathematical
objects here will be various moments of the probability distribution for
$A_t$,

$$\overline{(A_t)^n} = \lim_{T\to\infty}\frac{1}{T}\int_0^T dt\,(A_t)^n.$$ (10-42)


With $|\psi_0\rangle = \sum_n C_n|n\rangle$ (eq. (10-25a)), these moments are all determined in
terms of the matrix $M$ with elements

$$M_{mn} = C_m^{*}\, A_{mn}\, C_n,$$ (10-43)

where one needs to work out products of traces of powers of this matrix.
Evaluating the moments $\overline{(A_t)^n}$ ($n = 1,2,\ldots$) involves a bit of algebra, and
I state below the principal results of the analysis (see [131] for an outline
of the derivation).


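For the sake of convenience we redefine the observable $\hat A$ such that $\overline{A_t} = 0$.
Making use of the non-degeneracy assumption mentioned earlier, one
obtains the first few moments as

$$\overline{A_t} = 0, \quad \overline{A_t^2} = \mathrm{Tr}\,M^2, \quad \overline{A_t^3} = 2\,\mathrm{Tr}\,M^3,$$
$$\overline{A_t^4} = 3(\mathrm{Tr}\,M^2)^2 + 6\,\mathrm{Tr}\,M^4, \quad \overline{A_t^5} = 20\,\mathrm{Tr}\,M^2\,\mathrm{Tr}\,M^3 + 24\,\mathrm{Tr}\,M^5,$$
$$\ldots$$ (10-44)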
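As a worked illustration of how the entries of (10-44) arise, the second moment can be obtained directly. The sketch below assumes that the matrix $M$ has elements $M_{mn} = C_m^{*} A_{mn} C_n$ (the form suggested by (10-47) and (10-50a)), sets $\hbar = 1$, and uses the non-resonance condition, so that the time average of a phase factor survives only when the energies cancel pairwise:

```latex
A_t = \sum_{m,n} M_{mn}\, e^{i(E_m - E_n)t},
\qquad
\overline{A_t^2} = \sum_{m,n,k,l} M_{mn} M_{kl}\;
   \overline{e^{i(E_m - E_n + E_k - E_l)t}} ,
\\[4pt]
% surviving pairings: (m = n, k = l) and (m = l, n = k with m \neq n)
\overline{A_t^2}
  = \Big(\sum_n M_{nn}\Big)^{2} + \sum_{m \neq n} M_{mn} M_{nm}
  = \big(\overline{A_t}\big)^{2} + \mathrm{Tr}\,M^2 - \sum_n M_{nn}^{2} .
```

With $\overline{A_t} = 0$ the first term vanishes. Dropping the last term, which contains only the (shifted) diagonal elements, gives the second entry of (10-44); the higher moments follow from the same pairing argument applied to longer products.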


It is possible to estimate the magnitudes of these moments in terms of
a number of characteristic parameters relevant to the system. Recalling
that the initial state is a superposition of energy eigenstates belonging to
a narrow energy window around a mean energy $E$, one of these relevant
parameters is the mean energy spread introduced earlier,

$$\delta E = \Big(\sum_n |C_n|^2 (E_n - E)^2\Big)^{1/2}.$$ (10-45a)


The next characteristic quantity of interest is the energy width (with respect
to $\omega$) of the envelope function $f_A(E,\omega)$ (refer back to (10-30a)),

$$W = \frac{\int_{-\infty}^{\infty} d\omega\, |f_A(E,\omega)|^2}{|f_A(E,0)|^2}.$$ (10-45b)


One also needs, in addition to the energy spread $\delta E$, the typical size ($\delta A$)
of the quantum (and thermal) fluctuations in the state $|\psi_0\rangle$,

$$\delta A = \Big(\sum_n |C_n|^2 (A^2)_{nn}\Big)^{1/2},$$ (10-45c)

and the inverse participation ratio

$$\mathcal{I} = \frac{1}{\sum_n |C_n|^4},$$ (10-45d)

giving the effective number of levels in the state $|\psi_0\rangle$ defining the
energy window under consideration. Defining $\delta_{\rm eff} = \delta E/\mathcal{I}$ as the effective level
spacing within the energy window of interest, one expects the following
ordering to hold among these parameters, based on physical grounds,

$$\delta_{\rm eff} \ll W \ll \delta E \ll E\ (\sim k_B T).$$ (10-45e)


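Given the observable $\hat A$, the matrix $M$ of (10-43) can be looked upon as a
$D \times D$ banded random matrix, where $D \sim \delta E/\delta_{\rm eff}$ and the bandwidth is $\sim W/\delta_{\rm eff}$,
and where, within this band, the value of the typical entry is
$\sim \frac{\delta_{\rm eff}}{\delta E}\big(\frac{\delta_{\rm eff}}{W}\big)^{1/2}\delta A$
(refer back to (10-45c)).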
With these considerations in place, one can work out the orders of
magnitude of $\mathrm{Tr}\,M^k$ for various powers $k$ (the estimates differ for odd and even
powers). Then, knowing the moments of the probability distribution for
$A_t$, one can work out an estimate for the probability distribution ($P(A_t)$)
itself. It turns out that, for $A_t$ sufficiently close to the equilibrium value
($A_t \lesssim \delta A$, see below; recall that we have assumed the equilibrium
value to be shifted to $A_t = 0$) the probability distribution is a Gaussian,

$$P(A_t) \sim \exp\Big[-\epsilon\,\frac{\delta E}{\delta}\,\frac{A_t^2}{(\delta A)^2}\Big],$$ (10-46)
where $\epsilon$ is a factor of the order of unity and $\delta$ stands for the mean level
spacing near $E$. This tells us that, for sufficiently small values of $A_t$, the
probability distribution has ‘universal’ features, independent of the
details of the initial state (except for the energy width $\delta E$). Further, since the
mean level spacing is exponentially small in the system size, one obtains
again the earlier result that temporal fluctuations about the equilibrium
value are suppressed by a factor $\delta/\delta E \sim e^{-S}$.

For relatively larger values of $A_t$ ($\gtrsim \delta A$ where, typically, $\delta_{\rm eff} \sim \delta$), the
probability distribution acquires non-universal features.


Having got at the probability distribution of $A_t$, we now ask the question:
what is the conditional probability for $A_t$, subject to a given value ($A_0$)
at $t = 0$? This conditional probability ($P(A_t|A_0)$) is obtained in terms of
the joint probability $P(A_t, A_0)$, where the latter can be evaluated from
moments of the form $\overline{(A_{t+t'})^m (A_{t'})^n}$, and where the averaging is with respect
to $t'$, with $t$ held fixed. On expanding these moments in a manner
analogous to (10-44) and retaining only the dominant terms in the expansion,
one obtains a Gaussian distribution for $P(A_t, A_0)$ analogous to (10-46),
where the only relevant terms are

$$\overline{A_{t+t'}A_{t'}} = \sum_{m,n} |M_{mn}|^2 e^{i(E_m - E_n)t}, \qquad \overline{(A_{t'})^2} = \sum_{m,n} |M_{mn}|^2.$$ (10-47)

Defining the correlation function

$$C(t) = \frac{\overline{A_{t+t'}A_{t'}}}{\overline{(A_{t'})^2}},$$ (10-48)


which is normalized to $C(0) = 1$, one ends up with the estimate (refer,
once again, to [131] for further details)

$$P(A_t|A_0) \propto \exp\Big[-\epsilon\,\frac{\delta E}{\delta}\,\frac{(A_t - C(t)A_0)^2}{(1 - C(t)^2)(\delta A)^2}\Big],$$ (10-49)

in the universal regime ($A_t \lesssim \delta A$). Since $\delta E/\delta \sim e^{S}$, it follows that,
close to equilibrium, $A_t$ is overwhelmingly likely to hover round the value
$C(t)A_0$. In other words, the time evolution of the system under
consideration is, to all intents and purposes, deterministic, being solely dependent
on the initial value $A_0$. As we see below, the correlation function $C(t)$ does
not depend on the initial state, being determined by the envelope
function $f_A(E,\omega)$. In the non-universal regime ($A_t \gtrsim \delta A$), where $A_t$ is away
from equilibrium, the evolution is once again found to be approximately
deterministic, though not necessarily of the form $A_t \approx C(t)A_0$.


As for the correlation function $C(t)$ describing the time course of
relaxation in the universal regime close to equilibrium, an estimate can be
obtained by referring to (10-47) and (10-48), with $M_{mn}$ given by (10-43).
On making use of the envelope function $f_A(E,\omega)$ introduced in (10-30a),
one obtains

$$C(t) \propto \sum_{m,n} |C_m|^2 |C_n|^2 |A_{mn}|^2 e^{i(E_m - E_n)t} \propto \int_{-\infty}^{\infty} d\omega\, |f_A(E,\omega)|^2 e^{i\omega t},$$ (10-50a)

which tells us that, as expected, $C(t)$ does not depend on the initial state,
and that the bandwidth $W$ of $f_A(E,\omega)$ sets the time scale of $C(t)$, i.e.,
(essentially) the relaxation time in the universal regime. This can be
compared with the quantum mechanical Kubo correlation function defined
in (8-180) which, in the present context, reduces to (with $\hat B = \hat A$)

$$K_{AA}(t) \propto \int_{-\infty}^{\infty} d\omega\, \frac{\sinh(\beta\omega/2)}{\beta\omega/2}\, |f_A(E,\omega)|^2 e^{i\omega t}.$$ (10-50b)

Put differently, the envelope function $f_A(E,\omega)$ can be expressed in terms
of the Fourier transform of the unequal time auto-correlation function of $\hat A$
(i.e., the spectral function) as [116]

$$|f_A(E,\omega)|^2 = e^{-\beta\omega/2}\int_{-\infty}^{\infty} dt\, e^{i\omega t}\big[\langle n|\hat A(t)\hat A(0)|n\rangle - \langle n|\hat A(0)|n\rangle^2\big].$$ (10-50c)


One observes that if the bandwidth $W$ of $f_A(E,\omega)$ is less than $k_B T$ and
the falloff for $\omega \gg W$ is sufficiently sharp so as to make the integral


in (10-50b) converge, then 
C(t) = Kaa(t) + O(—z)- (10-50d) 


which means that the ETH correctly reproduces the time course of
regression of fluctuations as predicted by Onsager and as worked out in
the linear response theory (refer back to sections 8.4.7.3, 8.4.7.4).


It remains to be seen if the approach to equilibrium in the universal
regime constitutes a Markovian process, i.e., one without a ‘memory’.
This requires the correlation function $C(t)$ to be of the form $C(t) = e^{-\Gamma t}$
with $\Gamma$ real and positive, in which case the relaxation process can, in an
approximate sense, be described in terms of a differential equation of the
form $\frac{dA_t}{dt} = -\Gamma A_t$. This requires that the envelope function
$f_A(E,\omega)$ be such that $|f_A(E,\omega)|^2$ is of the form
$\frac{1}{\omega^2+\Gamma^2}$, with poles at $\omega = \pm i\Gamma$. This, however,
is inimical to the convergence of the integral in (10-50b), unless there occurs
a suppression at values of $\omega$ larger than $k_BT$. An approximately Markovian
relaxation process results from the theory if $\Gamma \ll k_BT$.
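The link between exponential relaxation and a Lorentzian spectrum can be checked with a standard contour integral. The following is only a sketch, assuming that (10-50a) has the form $C(t) \propto \int d\omega\, |f_A(E,\omega)|^2 e^{i\omega t}$ and idealizing $|f_A(E,\omega)|^2 \propto 1/(\omega^2+\Gamma^2)$ for all $\omega$:

```latex
\int_{-\infty}^{\infty} d\omega\, \frac{e^{i\omega t}}{\omega^2+\Gamma^2}
  = \frac{\pi}{\Gamma}\, e^{-\Gamma |t|}, \qquad \Gamma > 0,
\qquad\Longrightarrow\qquad
\frac{C(t)}{C(0)} = e^{-\Gamma |t|}, \quad
\frac{dC}{dt} = -\Gamma\, C \ \ (t > 0).
```

For $t > 0$ the contour is closed in the upper half plane and only the pole at $\omega = i\Gamma$ contributes. The Lorentzian tail falls off only as $\omega^{-2}$, so any weight in (10-50b) that grows faster than $\omega^2$ at large $|\omega|$ spoils convergence unless the tail is cut off, which is the suppression referred to above.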


In summary, the eigenstate thermalization hypothesis gives a reasonably 
correct account of the relaxation process in the near-equilibrium regime, 
consistent with the results of the linear response theory, and also pro- 
vides us with the theoretical framework to look into the non-universal 
far-from-equilibrium regime. In the near-equilibrium regime, the conditional
probability $P(A_t|A_0)$ (this probability is induced by the uniform
probability distribution of the time of observation, $t$) of an expectation
value $A_t$ at time $t\,(>0)$ subject to the initial value $A_0$ is of the form (refer
back to (10-49))


(Ar — C(t).Ao)? 
(dA)? 


P(A;|Ao) « exp [ — O(e~*) iF (10-51) 


where $(\delta A)^2$ denotes the mean squared amplitude of the fluctuations in
$A$ (eq. (10-34), eq. (10-45c)), and $C(t)$ stands for the correlation function
(equations (10-48), (10-50a)) that describes the course of the relaxation
process, coinciding with the Kubo correlation function under the conditions
stated above. The relaxation is seen, to all intents and purposes, to
be deterministic (with deviations $O(e^{-S})$) and symmetric under time reversal
($C(t)$ is an even function of $t$), with correlations decaying to zero
if the bandwidth $W$ of the envelope function $f_A(E,\omega)$ is finite though,
in principle, quasi-periodic resurgences are possible on the scale of the
Heisenberg time $\tau_H$, which is too large to be of relevance. The relation
$A_t \approx C(t)A_0$ in the near-equilibrium regime, which constitutes an effective
way of interpreting (10-49), can be looked upon as a statement of the
fluctuation-dissipation theorem in the present setting. In other words,
ETH leads to results consistent with the Boltzmann-Gibbs framework of
statistical mechanics. The evolution towards equilibrium is approximately
a Markovian one, with non-Markovian memory effects on a time
scale less than $\frac{\hbar}{k_BT}$.


10.3.5 ETH and thermodynamics 


The content of this chapter is based on [26] and [116]. 


The assumptions of quantum chaos and ETH are generally sufficient
to explain the phenomenon of thermalization, i.e., the fact that
isolated systems, left to themselves, eventually attain a state of equilibrium
where the statistical features of measured values of observables are
described by the equilibrium ensembles of statistical mechanics.

For the system as a whole, it is the microcanonical ensemble that is
of relevance. On the other hand, the state of a sufficiently small subsystem
is described by the canonical ensemble (refer to sec. 10.4.2.2)
at the temperature corresponding to the internal energy of the whole
system as given by the microcanonical ensemble, along with other parameters
like the number of particles and the volume. Considering a
system in interaction with a large heat bath, the former can be looked
upon as a subsystem of the composite system, where the temperature
of the bath determines the state of the (sub)system described by
the canonical ensemble.


Put differently, the ETH (which, in a sense, presupposes quantum chaos) 
is needed to establish the thermalization of an isolated system, taken 
far from equilibrium. In the present section, we will see how the same 
assumptions of quantum chaos and ETH lead us to the standard ther- 
modynamic principles and the non-equilibrium fluctuation relations in 
quantum mechanical terms. We recall the basic idea explaining equili- 
bration: when a system is left to itself, measured values of local observ- 
ables approach their expectation values in the diagonal ensemble (see 


below) resulting from their initial states. 


The phenomenon of thermalization has been addressed in sections 10.3.1
and 10.3.4. Sec. 10.4 below will contain a brief overview of equilibration
and thermalization of quantum systems.




CHAPTER 10. QUANTUM CHAOS AND FOUNDATIONS OF QUANTUM 
STATISTICAL MECHANICS 


10.3.5.1 Doubly stochastic processes 


In the attempt to establish the thermodynamic principles, one makes use 
of the theoretical framework relating to doubly stochastic processes. An 
equilibrium state is described by a density matrix that is diagonal in the 
energy basis, and constitutes an instance of the more general class of 
diagonal ensembles (refer back to sec. 10.2.8) where, generally speaking, 
a state represented by a diagonal ensemble is a stationary one when the 
system under consideration evolves in isolation. As it is made to interact 
with other systems around it, its state departs from being represented 
by a diagonal ensemble but, once the interaction ceases, there eventually
occurs a dephasing and a diagonal ensemble appears once again.


In the following, we consider a thermodynamic process in a thermally
insulated, i.e., closed, system whose Hamiltonian depends on one or
more parameters (collectively denoted by $\xi$) that can be made to vary with
time in accordance with some specified protocol, where the variation need
not be quasi-static. Such a process is, generally speaking, an irreversible
adiabatic one.


Recall the empirical fact that any two equilibrium states of a system
can be connected by an adiabatic process, possibly irreversible [15]
(i.e., one that takes the system from one state to the other, though an
adiabatic process in the opposite direction may not exist).


We assume that a process is initiated in a system with an initial Hamiltonian
$H$ from a stationary state represented by the diagonal ensemble (we
assume the eigenvalues of $H$ to be referred to in some given sequence,
say from the lowest energy to successively higher ones), of the form

$$\rho = \sum_n \rho_{nn} |n\rangle\langle n|. \tag{10-52a}$$

Likewise, the final stationary state will be taken to be of the form

$$\tilde\rho = \sum_m \tilde\rho_{mm} |\tilde m\rangle\langle \tilde m|, \tag{10-52b}$$

pertaining to some modified Hamiltonian, say, $\tilde H$, where the eigenvalues
and the corresponding eigenstates are again assumed to be taken in some
specified sequence (commonly, from the lowest energy upwards). Thus,
the indices such as $n$ and $\tilde m$ are used to refer to the initial and final sets
of energy eigenvalues and eigenstates, both taken in pre-assigned orders.
The ordering of the two sets will be assumed to be so chosen that, in the
case of a cyclic process where $H$ and $\tilde H$ are the same, pairs such as $n, \tilde n$;
$m, \tilde m$ become identical.


In between $t = 0$ and $t = \tau$ (the initial and final times), the evolution is
determined by a time-dependent Hamiltonian, depending on the protocol
by which the parameter $\xi$ is made to change (recall that, more generally,
$\xi$ may stand for more than one parameter), and the transformation from
the initial to the final state is described by a unitary evolution operator $U$
(see below where generalizations are allowed) as

$$\tilde\rho = U \rho\, U^\dagger. \tag{10-53}$$

Making use of (10-52a), (10-52b), one obtains

$$\tilde\rho_{mm} = \sum_n |\langle \tilde m|U|n\rangle|^2 \rho_{nn}
= \sum_n \rho_{nn}\, p_{n\to m} \quad \text{(say)}, \tag{10-54a}$$


connecting the initial and final diagonal density matrix elements, which
may be taken to represent the occupation probabilities of eigenstates
such as $|n\rangle$, $|\tilde m\rangle$ in the initial and final diagonal ensembles. The expressions
such as $p_{n\to m} = |\langle \tilde m|U|n\rangle|^2$ can then be interpreted as the transition
probabilities from initial to final eigenstates ($|n\rangle$ to $|\tilde m\rangle$), where these can
be taken to be the matrix elements of a transition matrix $P$,

$$P_{nm} = p_{n\to m} \quad (= |\langle \tilde m|U|n\rangle|^2). \tag{10-54b}$$

The unitarity of the evolution operator $U$ implies the following relations
characterizing the elements of the matrix $P$,

$$\sum_n P_{nm} = 1, \qquad \sum_m P_{nm} = 1, \tag{10-55a}$$

where all of these satisfy

$$0 \le P_{nm} \le 1. \tag{10-55b}$$
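The sum rules above can be checked numerically. The sketch below is illustrative only: it stands in for the evolution operator with a real orthogonal matrix (a special case of a unitary one) built from random Givens rotations, and the helper names `random_orthogonal` and `transition_matrix` are hypothetical, not taken from the text.

```python
import math
import random

def random_orthogonal(dim, n_rotations=200, seed=0):
    """Real orthogonal (hence unitary) matrix: a product of random Givens rotations."""
    rng = random.Random(seed)
    u = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(n_rotations):
        i, j = rng.sample(range(dim), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        for k in range(dim):
            # Rotate rows i and j; this preserves orthogonality.
            a, b = u[i][k], u[j][k]
            u[i][k] = c * a - s * b
            u[j][k] = s * a + c * b
    return u

def transition_matrix(u):
    """P[n][m] = |<m|U|n>|^2 = |u[m][n]|^2, the probability of the transition n -> m."""
    dim = len(u)
    return [[abs(u[m][n]) ** 2 for m in range(dim)] for n in range(dim)]

D = 5
U = random_orthogonal(D)
P = transition_matrix(U)

# Sum over final states (rows of P) and over initial states (columns of P).
row_sums = [sum(P[n][m] for m in range(D)) for n in range(D)]
col_sums = [sum(P[n][m] for n in range(D)) for m in range(D)]
print(row_sums)
print(col_sums)
```

Both lists come out equal to one up to rounding. The sum over final states expresses conservation of probability, while the sum over initial states is the additional constraint supplied by unitarity.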


These features identify $P$ as a doubly stochastic matrix, and the corresponding
time evolution as a doubly stochastic process. Denoting the
initial and final occupation probabilities ($\rho_{nn}$, $\tilde\rho_{mm}$) by $p_n$, $\tilde p_m$ for the sake of
brevity, we observe that (10-54a) is in the form of a master equation

$$\tilde p_m = \sum_n p_n\, p_{n\to m}. \tag{10-56}$$


Master equations based on doubly stochastic matrices have a number of
interesting properties. Since $P$ is, in general, not symmetric, it possesses
distinct sets of left and right eigenvectors, though the two sets correspond
to one single set of eigenvalues. The eigenvalue $\lambda = 1$ is always
present in the spectrum, corresponding to which the existence of the left
eigenvector $(1\ 1\ \cdots\ 1)$ is proved trivially (as ensured by the first relation
in (10-55a)). The right eigenvector (which we denote by $(\pi_1, \pi_2, \cdots, \pi_D)^T$), on
the other hand, corresponds to the asymptotic state the system tends to
under repeated applications of (10-56), and thus represents the asymptotic
stationary state resulting from a repeated application of the doubly
stochastic process.

This eigenvector is unique if the transition matrix $P$ does not have a block
diagonal form.

The course of approach to the asymptotic state is determined by the
eigenvalues other than $\lambda = 1$, which are all less than unity in magnitude, provided
the stationary state in question is unique.
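A small numerical sketch can make the approach to the asymptotic state concrete. It uses a hypothetical example that is not taken from the text: an equal-weight mixture of the identity and a cyclic shift on $D = 4$ states. This matrix is doubly stochastic but not symmetric, and its eigenvalues are $(1 + e^{2\pi i k/D})/2$, so the sub-leading ones have magnitude $\cos(\pi/4) \approx 0.707$.

```python
def mixture_matrix(dim, weight=0.5):
    """P = (1 - weight) * identity + weight * cyclic shift, a mixture of two permutations."""
    P = [[0.0] * dim for _ in range(dim)]
    for n in range(dim):
        P[n][n] += 1.0 - weight
        P[n][(n + 1) % dim] += weight
    return P

def step(p, P):
    """One application of the master equation: p_new[m] = sum_n p[n] * P[n][m]."""
    dim = len(p)
    return [sum(p[n] * P[n][m] for n in range(dim)) for m in range(dim)]

def distance_to_uniform(p):
    """L1 distance between p and the uniform (infinite temperature) distribution."""
    dim = len(p)
    return sum(abs(x - 1.0 / dim) for x in p)

D = 4
P = mixture_matrix(D)
p = [1.0, 0.0, 0.0, 0.0]          # start in a single eigenstate
distances = [distance_to_uniform(p)]
for _ in range(100):
    p = step(p, P)
    distances.append(distance_to_uniform(p))

print(p)
print(distances[:5])
```

The occupation probabilities tend to the uniform distribution, and the distance from it never increases from one step to the next. The rate of approach is controlled by the eigenvalues of magnitude less than unity.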


While a unitary time evolution describes a doubly stochastic process, the
latter defines a broader category of processes than unitary ones, though it
too is constrained by the relations (10-55a) where, in particular, the first
of the two relations represents a non-trivial constraint that the process
is required to satisfy. However, the class of doubly stochastic processes,
which features the important property that the product of any two doubly
stochastic matrices is also doubly stochastic, provides the appropriate
framework for addressing the issue of why and how a system comes to
abide by the standard thermodynamic principles. Thus, other than a
unitary evolution, the following processes are included in the category
of doubly stochastic ones: (a) projective measurements, which violate
unitarity, (b) a statistical mixture of several doubly stochastic processes.
The latter class, in turn, includes various dephasing mechanisms such
as the action of external noise or the application of external pulses, with
fluctuating waiting periods between successive pulses.


1. Processes satisfying the detailed balance principle $P_{nm} = P_{mn}$ constitute
a special case of the doubly stochastic ones.

2. A projective measurement can be represented as a quench followed by
a dephasing process.


10.3.5.2 The reversing process 


The transpose of a doubly stochastic matrix $P$ is also a doubly stochastic
one. As compared to the process represented by $P$, which we refer to
as the forward process in the present context, the one represented by $P^T$
corresponds to an interchange of the initial and final diagonal ensembles,
and can be referred to as the reversed process, where the forward process
is enacted in reverse sequence.


To begin with, we consider the special case of a unitary evolution (i.e., 


a process with no intervention of projective measurements or stages of 




dephasing) with evolution operator $U(\tau)$ (recall, from (10-53), that this
effects the transformation from the initial to the final ensemble). In this
case, the time-reversed process corresponds to $U^\dagger$, which implies that the
doubly stochastic matrix representing the reversed process is indeed $P^T$
(check this out). In this case of a unitary time-evolution from $t = 0$ to
$t = \tau$, the reversing protocol corresponds to a Hamiltonian with time-dependence
of the form $H^R(t) = H(\tau - t)$.
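The check suggested above takes one line. As a sketch, write $p^{R}_{m\to n}$ (a notation introduced here only for illustration) for the probability of a transition from the final eigenstate $|\tilde m\rangle$ to the initial eigenstate $|n\rangle$ under the reversed evolution $U^\dagger$:

```latex
p^{R}_{m \to n}
  = |\langle n|U^{\dagger}|\tilde m\rangle|^{2}
  = |\langle \tilde m|U|n\rangle^{*}|^{2}
  = |\langle \tilde m|U|n\rangle|^{2}
  = p_{n \to m}.
```

The reversed transition probabilities are thus the forward ones with the two indices interchanged, which is the statement that the reversed process is represented by the transpose of the forward transition matrix.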


More generally, the reversed process involves a reversing protocol of the
form mentioned above, together with projective measurements and dephasing
events effected in the reverse order as compared to the corresponding
events in the forward process, and the net result is represented
by the doubly stochastic matrix $P^T$, which does not necessarily correspond
to a unitary evolution operator.


In the special case of a time symmetric protocol satisfying $H(t) = H(\tau - t)$,
one has $P = P^T$, which implies the detailed balance relation $p_{n\to m} = p_{m\to n}$.
This corresponds to a cyclic process since $H(\tau) = H(0)$.


10.3.5.3 Doubly stochastic processes: general features 


We continue to focus on doubly stochastic processes in a system between
stationary states represented by diagonal ensembles, where such processes
include irreversible adiabatic ones and are of general relevance
in describing thermodynamic transformations between equilibrium and
non-equilibrium states of the system. In this section we state some of
the more important features of doubly stochastic processes pertaining to
thermodynamic behavior. In keeping with the general approach adopted
in this book, I will omit the proofs of most of the statements, instead
pointing out their relevance in the overall conceptual framework. We recall
that a doubly stochastic process is associated with a doubly stochastic
matrix. Since we consider finite dimensional systems for the sake of
conceptual simplicity, such a matrix will be assumed to be of a finite
dimension $D$.
A. Increase of the diagonal entropy. 


The diagonal entropy $S_d$ cannot decrease in a doubly stochastic process,
and can only approach the maximum possible value for a system with a
specified dimension $D$, i.e., $k_B \ln D$. One can define a ‘distance’ between
two diagonal ensembles ($\rho$, $\tilde\rho$, both corresponding to the same value of $D$),
such as the ones in (10-52a), (10-52b), in terms of the so-called Kullback-Leibler
divergence (commonly referred to as ‘relative entropy’), as

$$D_{\mathrm{KL}}(\rho\,\|\,\tilde\rho) = \sum_{n=1}^{D} \rho_{nn} \ln\left(\frac{\rho_{nn}}{\tilde\rho_{nn}}\right)
= \sum_{n=1}^{D} p_n \ln\left(\frac{p_n}{\tilde p_n}\right), \tag{10-57}$$

where the notation has already been explained.


The maximum possible value of the diagonal entropy for dimension $D$ is
$S_d = k_B \ln D$, corresponding to what is referred to as the infinite temperature
state since it is formally the same as the Gibbs canonical ensemble
at temperature $T = \infty$, in which all the energy eigenstates (and not just
those belonging to a narrow energy window) are occupied with the same




probability. As a doubly stochastic process takes the system from the
diagonal ensemble $\rho$ to $\tilde\rho$, its distance from the maximum entropy state
decreases (or remains unchanged, when the initial and final stationary
states both correspond to the same Hamiltonian), which implies that

$$\tilde S_d \ge S_d, \tag{10-58}$$

where, in keeping with the rest of the notation in the present chapter, $\tilde S_d$
is the diagonal entropy of the final state in a doubly stochastic process. It
may be mentioned that this result holds regardless of whether the system
under consideration is integrable or not. However, it presupposes that
the initial and final states are represented by diagonal ensembles since
otherwise the process considered need not be a doubly stochastic one.
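The monotonic behaviour of the diagonal entropy can be illustrated numerically. The sketch below makes several assumptions of its own: the doubly stochastic matrix is generated from a real orthogonal matrix built out of Givens rotations, the initial probabilities are arbitrary, entropies are measured in units of $k_B$, and the helper names are hypothetical.

```python
import math
import random

def random_doubly_stochastic(dim, n_rotations=200, seed=1):
    """P[n][m] = |U[m][n]|^2 for a real orthogonal U built from random Givens rotations."""
    rng = random.Random(seed)
    u = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(n_rotations):
        i, j = rng.sample(range(dim), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        for k in range(dim):
            a, b = u[i][k], u[j][k]
            u[i][k], u[j][k] = c * a - s * b, s * a + c * b
    return [[u[m][n] ** 2 for m in range(dim)] for n in range(dim)]

def diagonal_entropy(p):
    """Diagonal entropy in units of k_B: -sum_n p_n ln p_n, with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def kl_to_uniform(p):
    """Relative entropy of p with respect to the uniform distribution."""
    dim = len(p)
    return sum(x * math.log(x * dim) for x in p if x > 0.0)

D = 6
P = random_doubly_stochastic(D)
p_initial = [0.5, 0.3, 0.1, 0.06, 0.03, 0.01]   # arbitrary initial diagonal ensemble
p_final = [sum(p_initial[n] * P[n][m] for n in range(D)) for m in range(D)]

S_initial = diagonal_entropy(p_initial)
S_final = diagonal_entropy(p_final)
print(S_initial, S_final, math.log(D))
print(kl_to_uniform(p_initial), kl_to_uniform(p_final))
```

The relative entropy with respect to the uniform distribution equals $\ln D$ minus the diagonal entropy, so a decrease of this ‘distance’ and an increase of the entropy are the same statement.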


The increase of diagonal entropy under a doubly stochastic evolution is 
to be contrasted with the von Neumann entropy, which remains constant 
under any unitary evolution. In this context, it may be noted that the 
two entropies coincide in the case of a diagonal ensemble. Considering a 
unitary evolution from a stationary initial state, the final von Neumann 
entropy, which is the same as the initial diagonal entropy, has to be less 
than or equal to the final diagonal entropy, since the final state need not 


correspond to a diagonal ensemble. 


It is important to note that the diagonal entropy ($S_d$) (eq. (10-24)) is defined
not just for a diagonal ensemble but for any density matrix whatsoever,
though only the diagonal elements of the latter in the energy basis
are relevant in determining $S_d$. Moreover, if the initial state is stationary
then, in any time-dependent process in a closed Hamiltonian system,
we have $S_d(t) \ge S_d(0)$, where the final state need not be a stationary
one [115]. In such a process, the inequality $S_d(t) \ge S_{\mathrm{vN}}(t)$ is satisfied at
all times $t > 0$, where $S_{\mathrm{vN}}$ stands for the von Neumann entropy, which
remains unchanged.


It is also of relevance to note that if a system, initiated in a stationary 
state, is subjected repeatedly to successive doubly stochastic processes, 
then it tends asymptotically to the infinite temperature state, which ap- 
pears as the unique attractor of the stationary states appearing in the 
successive stages. Assuming that the successive doubly stochastic pro- 
cesses are all identical, each represented by the doubly stochastic matrix 
P, the course of approach to the infinite temperature state is determined 


by the eigenvalues of P less than unity. 


B. The Kelvin formulation of the second law. 


We consider a diagonal ensemble in which the occupation probability
($p_n = \rho_{nn}$ in our present notation) decreases monotonically with energy,
i.e., for any two indices $n, m$

$$(p_n - p_m)(E_n - E_m) \le 0. \tag{10-59}$$

A diagonal density matrix satisfying such a relation is referred to as a
passive one, of which a Gibbs canonical ensemble constitutes a particular
instance. This property of a passive density matrix implies the following
important result: in a doubly stochastic cyclic process, i.e., one in which
the initial and final Hamiltonians are identical (thereby characterized by
the same energy spectrum), the mean energy cannot decrease, i.e.,

$$\sum_n E_n (\tilde\rho_{nn} - \rho_{nn}) \ge 0, \tag{10-60}$$

where $\rho$, $\tilde\rho$ denote the initial and final diagonal density matrices, of which
the former is assumed to be a passive one. The proof of this statement
makes use of the fact that the diagonal entropy tends to increase in a doubly
stochastic process, making the occupation probabilities more evenly
distributed among the different energies.


By energy conservation, the above expression represents the expectation
value of the work performed on the system (recall that, under our earlier
assumption, a doubly stochastic process corresponds to an adiabatic one,
possibly irreversible),

$$\langle W \rangle \ge 0. \tag{10-61}$$

In other words, energy in the form of work cannot be extracted from a system
in a cyclic process, without heat being given to it, which is the Kelvin
formulation of the second law of thermodynamics. In the present context,
this law has been arrived at by considering doubly stochastic processes
under conditions more general than ones corresponding to initial and
final states described by canonical ensembles.
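The non-negativity of the mean work for a passive initial state can be illustrated with a few lines of code. The example is a sketch under assumptions of its own: the energy levels and inverse temperature are arbitrary, $k_B = 1$, and the doubly stochastic matrix is built as a random mixture of permutation matrices (the helper `permutation_mixture` is hypothetical).

```python
import math
import random

def permutation_mixture(dim, n_terms=8, seed=2):
    """Doubly stochastic P[n][m] built as a random convex mixture of permutation matrices."""
    rng = random.Random(seed)
    raw = [rng.random() + 0.1 for _ in range(n_terms)]
    total = sum(raw)
    P = [[0.0] * dim for _ in range(dim)]
    for w in raw:
        perm = list(range(dim))
        rng.shuffle(perm)
        for n in range(dim):
            P[n][perm[n]] += w / total
    return P

energies = [0.0, 0.7, 1.1, 1.8, 2.5]       # ascending spectrum, arbitrary units
beta = 1.3                                  # inverse temperature, with k_B = 1
weights = [math.exp(-beta * e) for e in energies]
Z = sum(weights)
p_initial = [w / Z for w in weights]        # Gibbs state, passive by construction

dim = len(energies)
P = permutation_mixture(dim)
p_final = [sum(p_initial[n] * P[n][m] for n in range(dim)) for m in range(dim)]

# Cyclic process: same spectrum before and after, so the energy change is the mean work.
mean_work = sum(E * (pf - pi) for E, pf, pi in zip(energies, p_final, p_initial))
print(mean_work)
```

Each permutation can only move the larger occupation probabilities towards higher energies or leave them in place, so no term of the mixture lowers the mean energy. This is the rearrangement argument behind the inequality for the mean work.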


10.3.5.4 The fundamental thermodynamic relation 


A. Summary of basic notions.

The interpretation of the basic thermodynamic principles in microscopic
terms, using the equilibrium ensembles of statistical mechanics, has been
set out in [2], chapter 5 (see also [108]; for a brief outline, refer back to
sec. 2.3.2).


In order to provide an interpretation to thermodynamic principles, one 
has to start from the basic concepts of a macroscopic state, thermo- 
dynamic values of observables in a state, internal energy, temperature, 


heat, work, and entropy. 


Briefly, a macroscopic state is described by a density matrix ($\rho$) that
determines the expectation values of relevant observables characterizing the
system under consideration (the density matrix, at the same time, provides
the link to the microscopic description). The thermodynamic value
of an observable $A$ is represented by the trace $\mathrm{Tr}(A\rho)$. An equilibrium
state corresponds to a density matrix such that the values of all the relevant
observables are independent of time (and, additionally, all the fluxes
and affinities are zero).


The internal energy ($U$; this is not to be confused with the evolution
operator denoted by $U$) of the system in a state $\rho$ is taken to be given by

$$U = \mathrm{Tr}(\rho H), \tag{10-62}$$

where $H$ stands for the Hamiltonian operator that determines the microscopic
dynamics of the system. The thermodynamic entropy ($S$) for an
equilibrium state is represented by the von Neumann entropy $-k_B \mathrm{Tr}(\rho \ln \rho)$
(also termed the ‘statistical entropy’ since it pertains to the statistical loss
of information in the macroscopic description in terms of a density matrix).
The issue of assigning an entropy to a non-equilibrium state will be
briefly addressed below.


Heat and work are not state functions, and depend on the process that
the system is subjected to. Processes can be of various types. In an
isolated system, where there is no interaction with the external world,
the evolution is described by a unitary operator $U$ determined by the
time-independent Hamiltonian $H$. For a closed system (i.e., one not in thermal
contact with external systems) under the influence of an external field or
of some external system that can perform work on it, one has, in general,
a time dependent unitary operator $U(t)$. The time dependence commonly
enters through one or more parameters ($\xi(t)$, which we assume to be
classical ones for the sake of simplicity) in the Hamiltonian. The process
under consideration depends on the ‘protocol’ according to which the
parameters are made to change. An infinitesimally slow time-evolution,
which corresponds to an idealized process, represents a quasi-static one
that describes a reversible adiabatic change in which energy is supplied to
the system in the form of work, but the thermodynamic entropy remains
unchanged. On the other hand, a protocol in which $\xi(t)$ is made to change
at some finite rate (which includes a rapid variation that may be almost
instantaneous) corresponds to an irreversible adiabatic process in which
the entropy increases.


In the literature, the term ‘adiabatic’ is at times used to refer to a
reversible adiabatic process. More generally, however, the term is
meant to denote a process, reversible or irreversible, where there
occurs no exchange of heat. The term ‘adiabatic’ is used in quantum
mechanical perturbation theory in the sense of a slow process
in which higher order terms in the perturbation series become insignificant
in comparison with the leading approximation. However,
such a process has to be incomparably slow when referred to the
time scale of a reversible adiabatic process as considered in thermodynamics [2].


In contrast, an open system is one where energy is supplied in the form 
of heat (in addition to the supply of energy in the form of work) in virtue 
of a temperature differential (more generally, an exchange of matter may 


also take place, but such processes will not be explicitly referred to here). 


In microscopic terms, a process of heat exchange by an infinitesimal
amount corresponds to a change in the density matrix without any change
in the Hamiltonian and the associated energy levels [2]:

$$\delta Q = \mathrm{Tr}(\delta\rho\, H). \tag{10-63a}$$

On the other hand, the infinitesimal amount of work performed on the
system in an adiabatic process corresponding to a small change in the
relevant parameters is of the form

$$\delta W = -\sum_a X_a\, \delta\xi_a, \tag{10-63b}$$

where the index $a$ ranges over the set of control parameters $\{\xi_a\}$ on which
the Hamiltonian ($H$) depends (the negative sign on the right hand side is
a matter of convention). In general, these depend on the time $t$, as a
result of which the evolution operator $U$ becomes time-dependent. The
generalized forces $X_a$ are then given by

$$X_a = -\left\langle \frac{\partial H}{\partial \xi_a} \right\rangle, \tag{10-63c}$$

where the averaging is with respect to the (appropriate) density matrix,
possibly a non-equilibrium one (see below). These are assumed to represent
the thermodynamic forces conjugate to the ‘displacements’ $\xi_a$.


Refer back to sections 8.2.2, 8.5. Observe the change of notation where
the parameters $\xi_a$ of the present section are denoted by $X_k$ ($k = 1, 2, \cdots$) in
these earlier sections (where the set of $X_k$'s include, along with the ‘displacement’
parameters $\xi_a$, the internal energy $U$ as well), while the generalized
forces, denoted by $X_a$ in the present section, are obtained from the corresponding
affinities ($A_k$) of the earlier sections by multiplying with $T$. These are
conjugate to the set of parameters $\xi_a$ that does not include the internal
energy.


In terms of the density matrix and the change in the Hamiltonian, the
expression for an infinitesimal amount of work performed on the system
under consideration appears in the form

$$\delta W = \mathrm{Tr}(\rho\, \delta H), \tag{10-64}$$

where $\delta H$ stands for the change in the Hamiltonian resulting from the
displacements $\delta\xi_a$, causing a shift in the energy eigenvalues.


As mentioned above, work can be performed on a closed system (i.e., one
that is thermally insulated from external systems) by means of a protocol
where the displacement parameters $\xi_a$ are made to vary at finite rates
(where the variations may possibly be rapid ones). Such an irreversible
adiabatic process can be taken into account by adopting an appropriate
interpretation of entropy in terms of the density matrix (see below)
and by relating $\delta\xi_a$ in (10-63b) to the rates of variation of the control
parameters, given the duration $\delta t$ of the process in question (refer back to
section 9.14.1 where an analogous approach was adopted in the classical
context).


The formulae (10-62), (10-63a), and (10-64) are consistent with the differential
form of the first law of thermodynamics,

$$\delta U = \delta Q + \delta W. \tag{10-65}$$


However, eq. (10-65) is more general in scope since it applies to the case 
of an irreversible process as well. In this case, entropy is produced as a 
consequence of work performed irreversibly on the system. The concept 
of entropy relates to irreversibility and the second law of thermodynam- 
ics, whose differential expression constitutes the fundamental thermody- 


namic relation for a system. 
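The consistency of (10-62), (10-63a), and (10-64) with the first law is a matter of applying the product rule to the trace. As a short worked step, with $\delta\rho$ and $\delta H$ denoting small changes in the density matrix and the Hamiltonian, and with the second order term dropped:

```latex
\delta U = \delta\, \mathrm{Tr}(\rho H)
         = \mathrm{Tr}(\delta\rho\, H) + \mathrm{Tr}(\rho\, \delta H)
           + \mathrm{Tr}(\delta\rho\, \delta H)
         \simeq \underbrace{\mathrm{Tr}(\delta\rho\, H)}_{\delta Q}
           + \underbrace{\mathrm{Tr}(\rho\, \delta H)}_{\delta W}.
```

The first term changes the occupation of fixed energy levels and the second shifts the levels at fixed occupation, which is the microscopic distinction between heat and work used above.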


B. Entropy and the fundamental thermodynamic relation

If we consider two equilibrium states close to one another representing
the terminal points of a small segment of a reversible thermodynamic
process (as represented by a path in the thermodynamic state space)
then these satisfy the following fundamental thermodynamic relation

$$\delta U = T\delta S - \sum_a X_a\, \delta\xi_a. \tag{10-66a}$$

In the following, we will assume that the set of parameters $\xi_a$ includes
just one single parameter $\xi$ (such as the volume $V$ in the case of a simple
fluid, in which case the entropy is a unique function of $U, V$) for the sake
of simplicity, when the fundamental relation appears in the form

$$\delta U = T\delta S - X\delta\xi \tag{10-66b}$$


(the generalization to a system with multiple parameters $\xi_a$ is straightforward).
Here $S$ stands for the thermodynamic entropy of an equilibrium
state represented, in microscopic terms, by the von Neumann entropy of
the equilibrium density matrix. The thermodynamic temperature $T$ has
the following interpretation: for a family of equilibrium states corresponding
to a fixed value of $\xi$, one has

$$\frac{1}{T} = \frac{\delta S}{\delta U} \qquad (\xi = \text{constant}), \tag{10-66c}$$

while, for an equilibrium state of a system weakly interacting with a heat
bath (the weak interaction describes a thermal contact), it is related to
the Lagrange multiplier $\beta$ determining the condition of equilibrium as

$$T = (k_B\beta)^{-1}. \tag{10-66d}$$


With the equilibrium state described either by the microcanonical or the
canonical ensemble, its von Neumann entropy coincides with the diagonal
entropy. However, the former remains unchanged in a unitary evolution
that may, in general, be a time-dependent one ([2], chapter 3) and
thus cannot represent the thermodynamic entropy for all states. In particular,
the fundamental formula (10-66b) does not hold for all thermodynamic
processes with the von Neumann entropy standing for $S$ since, for
instance, both the terms on the right hand side are zero in a cyclic process
involving irreversible work performed on a closed system (i.e., one in
which there is no transfer of heat, and the displacement parameter(s) $\xi$
are brought back to their initial values after being made to change at a finite
rate), while the internal energy increases by the amount of work done
(refer back to the Kelvin formulation of the second law outlined above).


In other words, the question arises as to how to represent in statistical
terms the thermodynamic entropy at any intermediate point of an irreversible
process, i.e., for a non-equilibrium density matrix $\rho$. More specifically,
with $\delta U = \delta\mathrm{Tr}(\rho H)$ (refer to eq. (10-62)), and with $\delta W$ representing a
small amount of irreversible work performed on the system, what should
be the expression of the entropy $S$ in terms of $\rho$, where the expression
should define a unique function of $\rho$ and $H$? Moreover, it has to reduce
to the von Neumann entropy when $\rho$ stands for an equilibrium ensemble.

It turns out that the diagonal entropy fits the bill [26], [116], [115]. This
requires that the system under consideration be sufficiently large and the
density matrix ($\rho$) in question satisfy certain conditions, namely, that (a)
it is to be a superposition or a mixture of energy eigenstates belonging to
a sufficiently narrow energy range and (b) within this energy range, the
eigenstates actually involved in the making up of $\rho$ should be sufficiently
distributed, i.e., should not be limited to a small subset (in other words,
the inverse participation ratio should be sufficiently large). Finally, and
significantly, quantum chaos appears to be a basic pre-requisite for the
identification of $S_d$ as the thermodynamic entropy in a non-equilibrium
state $\rho$.


Specifically, this identification is based on the smoothness property of
the energy distribution function $P(E)$ characterizing the density matrix,
where

$$P(E) = \rho(E)\,\Omega(E). \tag{10-67}$$

In this expression, $\rho(E)$ stands for an appropriate interpolating function
satisfying $\rho(E_n) = (\rho)_{nn}$ (the diagonal element of $\rho$ corresponding to the
energy eigenvalue $E_n$), and $\Omega(E)$ is a smoothened density of states. In the
case of a sufficiently large non-integrable system $P(E)$ is expected to be a
smooth function while for an integrable system it exhibits large fluctuations.
In consequence, the diagonal entropy deviates from the integral of
$S_{\mathrm{microcanonical}}(E)$ (the microcanonical entropy at energy $E$) weighted by $P(E)$,
by at most a sub-extensive term in the case of a non-integrable system,
while the deviation is, in general, large (i.e., extensive in the system size)
in the case of an integrable system.


The identification of the diagonal entropy as the one satisfying the
fundamental thermodynamic relation (10-66a) is supported by numerical
evidence [26], [115].


10.3.5.5 The fluctuation relations 


The transient fluctuation relations characterizing non-equilibrium pro- 
cesses in thermodynamic systems were introduced in the classical con- 
text in section 9.14. In the classical derivation of the fluctuation rela- 
tions the chaotic dynamics at the microscopic level does not appear to be 
of fundamental relevance, though it does feature indirectly in identifying 
the class of initial states for the non-equilibrium processes to which the 


fluctuation relations are supposed to apply (refer back to section 9.14.2). 


Since there are numerous variants of transient fluctuation relations, we
first specify (for the sake of ready reference and for notational clarity) the
type of processes on which we focus, one pertaining to the Crooks fluctuation
relation examined in section 9.14.3 in the classical context (recall
that the Jarzynski relation follows from this). We assume that, starting
from a canonical ensemble A at temperature $T$, the system under consideration,
closed to heat exchange, is made to follow a time-dependent unitary
evolution (or, more generally, a doubly stochastic process) in which
a macroscopic displacement parameter, denoted by $\xi(t)$ (we consider only
one such parameter for the sake of simplicity), undergoes a change in
accordance with some externally determined protocol for some specified
time $\tau$. We denote the initial and final values of the control parameter
by $\xi_A, \xi_B$. As the value $\xi = \xi_B$ is reached, the protocol is terminated,
where the terminal state is not, in general, an equilibrium one. The work
done on the system for this ‘forward’ process is obtained as the average of
fluctuating values corresponding to individual trajectories initiated from
various microstates compatible with the initial ensemble A, each such
trajectory being in accordance with the prescribed protocol for time $\tau$.
The probability of obtaining some specified value $W$ among all these
fluctuating amounts of work will be denoted by $P_F(W)$, with the value of $\tau$ left
implied.


We compare $P_F(W)$ with the probability of work $-W$ pertaining to the
reversed process (we denote this by $P_R(-W)$), where the latter starts from
the value $\xi_B$ of the control parameter, with the system in an initial state
described by a canonical ensemble at the same temperature $T$ as that of
the initial state in the forward process. The reversed process is made to
follow a time-reversed protocol $\xi^{(R)}(t) = \xi^{(F)}(\tau - t)$ (the super-indices ‘R’
and ‘F’ refer to the reversed and the forward processes) and terminates at
the value $\xi_A$. The Crooks fluctuation relation then tells us that the ratio
of probabilities $P_F(W)$ and $P_R(-W)$ for the two non-equilibrium processes
is given by (refer back to (9-184), where the notation is slightly different)

$$\frac{P_F(W)}{P_R(-W)} = e^{\beta(W - \Delta F)} \qquad (\beta = (k_BT)^{-1}), \tag{10-68}$$


where $\Delta F = F_B - F_A$ is the difference of free energies of the equilibrium
states corresponding to values $\xi = \xi_B$ and $\xi = \xi_A$ of the control parameter,
the temperature being $T$ for both the states (recall that the terminal
configuration need not be an equilibrium state corresponding to a
well-defined temperature $T$ in either of the forward and reversed processes).


We now derive (10-68) for the forward and reversed processes as de- 
scribed above, the initial state in either case being an equilibrium Gibbs 
ensemble, where we will see that the derivation does not require the sys- 
tem under consideration to be non-integrable. Subsequently, we relax 
the condition on the initial states, assuming instead that these are repre- 
sented by diagonal ensembles, and arrive at the same fluctuation relation 
by making use of the property of quantum chaos (effectively, the diagonal 


part of ETH). 


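The energy eigenstates $|n^{(A)}\rangle$ for $\xi = \xi_A$, corresponding to eigenvalues $E_n^{(A)}$,
are distributed in the Gibbs ensemble at temperature $T$ with probabilities

$$p_n^{(A)} = \frac{1}{Z_A}\, e^{-\beta E_n^{(A)}}, \tag{10-69a}$$

where

$$Z_A = \sum_n e^{-\beta E_n^{(A)}} \qquad (\beta = (k_BT)^{-1}). \tag{10-69b}$$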


A realization of the forward process takes the system from an eigenstate
$|n^{(A)}\rangle$ to one with energy $E_n^{(A)} + W$, where $W$ is the work performed on the
system. The probability $P_F(W)$ mentioned above is then given by

$$P_F(W) = \sum_{n,m} p_n^{(A)}\, p^{(F)}_{n(A)\to m(B)}\, \delta\big(\bar E_m^{(B)} - E_n^{(A)} - W\big), \tag{10-70a}$$

where the bar over a symbol pertains to the final state (parameter value
$\xi_B$) and $p^{(F)}_{n(A)\to m(B)}$ denotes the transition probability from initial state $|n^{(A)}\rangle$ to
final state $|\bar m^{(B)}\rangle$ under the doubly stochastic forward process (refer back
to sec. 10.3.5.1). Note that the sub- or super-indices A, B refer to values
$\xi_A, \xi_B$ of the control parameter.


Likewise, the probability $P_R(-W)$ pertaining to the reversed process is
given by

$$P_R(-W) = \sum_{n,m} \bar p_m^{(B)}\, p^{(R)}_{m(B)\to n(A)}\, \delta\big(E_n^{(A)} - \bar E_m^{(B)} + W\big), \tag{10-70b}$$

where it may be noted that the symbols with bar overhead are now used,
for the sake of easy comparison with (10-70a), to refer to the initial state
in the reversed process and those without bar correspond to the final
state. The probability $\bar p_m^{(B)}$, analogous to $p_n^{(A)}$ (formulae (10-69a), (10-69b)),
is given by

$$\bar p_m^{(B)} = \frac{1}{Z_B}\, e^{-\beta \bar E_m^{(B)}}, \tag{10-71a}$$

with

$$Z_B = \sum_m e^{-\beta \bar E_m^{(B)}} \qquad (\beta = (k_BT)^{-1}). \tag{10-71b}$$


We now make use of the relation between the transition matrix elements
for the forward and reversed processes, the two doubly stochastic matrices
being transpose to each other (refer to sec. 10.3.5.2):

$$p^{(F)}_{n(A)\to m(B)} = p^{(R)}_{m(B)\to n(A)}, \tag{10-72}$$

and, at the same time, of the delta function in (10-70b) to write it in the
form

$$P_R(-W) = \frac{Z_A}{Z_B}\, e^{-\beta W} \sum_{n,m} p_n^{(A)}\, p^{(F)}_{n(A)\to m(B)}\, \delta\big(\bar E_m^{(B)} - E_n^{(A)} - W\big), \tag{10-73}$$

(check this out). Comparing this with (10-70a), one obtains

$$P_R(-W) = \frac{Z_A}{Z_B}\, P_F(W)\, e^{-\beta W}. \tag{10-74a}$$
B 


The thermodynamic relations

$$Z_A = e^{-\beta F_A}, \qquad Z_B = e^{-\beta F_B}, \tag{10-74b}$$

pertaining to the equilibrium Gibbs states corresponding to $\xi = \xi_A$, $\xi = \xi_B$,
now lead to the Crooks fluctuation relation

$$\frac{P_F(W)}{P_R(-W)} = e^{\beta(W - \Delta F)} \tag{10-75}$$

(one observes that the classical and quantum mechanical derivations are
entirely analogous to each other, as indeed these ought to be).
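The derivation above can be checked numerically on a toy model. The following is a minimal sketch, assuming a hypothetical three-level system with made-up energy levels for the two parameter values and an arbitrary doubly stochastic transition matrix; none of these numbers come from the text. It builds the forward and reversed work distributions from canonical initial ensembles and compares their ratio with $e^{\beta(W-\Delta F)}$.

```python
import math

beta = 1.3                      # inverse temperature (arbitrary units)
E_A = [0.0, 1.0, 2.5]           # hypothetical levels at xi = xi_A
E_B = [0.5, 1.0, 3.0]           # hypothetical levels at xi = xi_B

# Doubly stochastic forward matrix, T[n][m] = p(n -> m):
# every row and every column sums to 1.
T = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
# Reversed process uses the transpose: T_rev[m][n] = T[n][m].
T_rev = [[T[n][m] for n in range(3)] for m in range(3)]

Z_A = sum(math.exp(-beta * e) for e in E_A)
Z_B = sum(math.exp(-beta * e) for e in E_B)
delta_F = -math.log(Z_B / Z_A) / beta   # F_B - F_A

def work_distribution(E_init, E_final, Z_init, trans):
    """Return {W: probability} for a canonical initial ensemble."""
    dist = {}
    for i, Ei in enumerate(E_init):
        p_i = math.exp(-beta * Ei) / Z_init
        for f, Ef in enumerate(E_final):
            W = round(Ef - Ei, 12)      # discrete work values as dict keys
            dist[W] = dist.get(W, 0.0) + p_i * trans[i][f]
    return dist

P_F = work_distribution(E_A, E_B, Z_A, T)       # forward: A -> B
P_R = work_distribution(E_B, E_A, Z_B, T_rev)   # reversed: B -> A

# Crooks ratio for every work value that occurs in the forward process.
crooks_ratio = {W: P_F[W] / P_R[-W] for W in P_F}
crooks_expected = {W: math.exp(beta * (W - delta_F)) for W in P_F}

# Jarzynski average <exp(-beta W)> over the forward process.
jarzynski_avg = sum(p * math.exp(-beta * W) for W, p in P_F.items())
```

In this sketch the entries of `crooks_ratio` and `crooks_expected` agree for each work value, and `jarzynski_avg` equals $e^{-\beta\Delta F}$. Only the double stochasticity of `T` and the canonical form of the initial ensembles are used, in line with the remark that non-integrability is not needed at this stage.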


We now approach the same problem of relating the probabilities for the 
forward and reversed processes by making use of the fact, as implied by 
the ETH, that individual eigenstates within a narrow energy shell can be 
taken to have very similar structures, each emulating the microcanonical 
ensemble in respect of expectation values. For a sufficiently large system, 


one can assume the distribution of energy levels to be quasi-continuous. 
Considering the forward transition from a typical energy level $E_n^{(A)}$ to
$E_n^{(A)} + W$, we express the probability as

$$\begin{aligned}
P_F(E_n^{(A)} \to E_n^{(A)} + W) &= \sum_m p^{(F)}_{n(A)\to m(B)}\, \delta\big(\bar E_m^{(B)} - E_n^{(A)} - W\big) \\
&= \int dE\, \Omega_B(E)\, p(E_n \to E)\, \delta(E - E_n - W),
\end{aligned} \tag{10-76}$$


where $\Omega_B$ is the density of states for $\xi = \xi_B$, and the rest of the notation
(simplified for the sake of clarity) is self-explanatory. Here we have
assumed that the system conforms to the requirements of quantum chaos
and have made use of the fact that, in consequence, the transition
probability $p^{(F)}_{n(A)\to m(B)}$ is a smooth function ($p(E_n \to E)$) of the final energy $E$
(following a simplified notation), with small fluctuations. In the case of
integrable (non-chaotic) systems the transition probability is characterized
by strong fluctuations in the initial and final energies, and one cannot
convert the summation (first line of the above equation) to an integration
(second line). Performing the integration over $E$, one obtains

$$P_F(E_n \to E_n + W) = p(E_n \to E_n + W)\, \Omega_B(E_n + W). \tag{10-77}$$


Likewise, the probability for work $-W$ performed on the system in the
reversed process starting from the state $|\bar m\rangle$ (for $\xi = \xi_B$) is seen to be

$$P_R(\bar E_m \to \bar E_m - W) = p(\bar E_m \to \bar E_m - W)\, \Omega_A(\bar E_m - W), \tag{10-78}$$

where $\Omega_A$ is the density of states for $\xi = \xi_A$ (check this out). Comparing
the two probabilities, making use of (10-72), and substituting $E_n \to E$,
$\bar E_m \to E + W$ (based on smooth energy dependence implied by quantum
chaos), one obtains (with a small but self-explanatory change in notation)

$$\frac{P_F(E, W)}{P_R(E + W, -W)} = \frac{\Omega_B(E + W)}{\Omega_A(E)} = e^{k_B^{-1}[S_B(E+W) - S_A(E)]}, \tag{10-79}$$


where we have made use of the thermodynamic entropy expressed in 
terms of the density of states of the initial and final configurations (as 
indicated in sec. 10.3.5.4, these can, in turn, be represented as the re- 
spective diagonal entropies). Thus, the ratio of probabilities of doing work 
W in the forward process and —W in the reverse process is simply related 


to the respective densities of states in the case of non-integrable systems. 


The above result, which constitutes a general form of fluctuation relations,
can now be seen to lead to the Crooks fluctuation relation for a
subsystem of a bigger system in the general case when the coupling
between the subsystem and the complementary system is not necessarily
weak (a weak coupling leads to a Gibbs distribution in the subsystem,
which has already been considered above). Assuming that the number of
particles ($N'$) in the subsystem is small compared to the size of the entire
system ($N$), and focusing on the work done on the subsystem, we can
expand $S_B(E+W)$ in $W$ (which is sub-extensive in $N$) and obtain

$$k_B^{-1}[S_B(E + W) - S_A(E)] \approx k_B^{-1}[S_B(E) - S_A(E)] + k_B^{-1}\frac{\partial S_B}{\partial E}\, W = \beta(W - \Delta F), \tag{10-80a}$$

where we have used $\partial S_B/\partial E = 1/T$ along with the thermodynamic relation

$$k_B^{-1}[S_B(E) - S_A(E)] = k_B^{-1}[S(E, \xi_B) - S(E, \xi_A)] = \beta(F_A - F_B) = -\beta\,\Delta F. \tag{10-80b}$$
On substituting in (10-79), one again obtains the Crooks fluctuation
relation (see also [116]), where $E$ and $E + W$ can be interpreted as the
internal energies of the initial and final states in the forward process
that we assume to satisfy the conditions of narrow energy distribution
and sufficiently large values of the inverse participation ratio, these being
pre-requisites for ETH to apply.


It may be noted that, once the probability ratio is expressed in terms 
of the density of states, the Boltzmann entropy formula can be in- 


voked regardless of the type of process the system undergoes. 


As indicated earlier (sec. 9.14.3), the Jarzynski fluctuation relation is ob- 
tained by integration from the Crooks relation. The Jarzynski relation 
constitutes a formula of major relevance in statistical mechanics, char- 
acterizing irreversible processes. As outlined here, the validity of either 
of the two fluctuation relations does not require an initial state described 
by a Gibbs ensemble since one described more generally by a diagonal 


ensemble suffices in the case of a non-integrable system. 
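The integration step leading from the Crooks relation to the Jarzynski relation can be spelt out as follows. This is a sketch that uses only the Crooks relation in the form $P_F(W)\,e^{-\beta W} = e^{-\beta\Delta F}\,P_R(-W)$ and the normalization of $P_R$; the notation $\langle\cdot\rangle_F$ for an average over realizations of the forward process is introduced here for convenience.

```latex
\begin{align}
\left\langle e^{-\beta W} \right\rangle_F
  &= \int dW \, P_F(W)\, e^{-\beta W} \\
  &= e^{-\beta \Delta F} \int dW \, P_R(-W) \\
  &= e^{-\beta \Delta F}.
\end{align}
```

The last step uses $\int dW\, P_R(-W) = 1$, so the free energy difference between two equilibrium states is obtained from work values measured in a process that need not be quasi-static.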


10.4 Equilibration and thermalization: an overview 


In this section we present a brief overview of results on equilibration and
thermalization of quantum mechanical systems, with particular emphasis
on unitary evolution of isolated finite dimensional systems. The eigenstate
thermalization hypothesis, which provides an explanation of thermalization
of quantum systems conforming to quantum chaos, has been
outlined in sec. 10.3, where a number of consequences of the hypothesis have
been indicated. In the present section we briefly review thermalization in
the larger context of equilibration of quantum mechanical systems, while
focusing on general features of the two.


In the following, dynamical considerations on isolated quantum mechan- 
ical systems will be complemented with considerations based on dynam- 


ical typicality. 


A vast literature now exists on the issue of equilibration and thermalization,
resulting from experimental and computational work in the last
two decades, that sheds light on the age-old problem of long-time behavior
of isolated quantum systems where these appear to always seek out
the state of thermal equilibrium in the long run. The current interest in
questions relating to unitary evolution of many-body systems has been 
generated primarily by three concomitant factors [56] — (1) enormous im- 
provements in experimental techniques that have ushered in a new era 
in the study of interacting many-body systems, especially those involving 
ultra-cold atoms in optical lattices and trapped ions where an exquisite 
degree of control can be exercised, opening up possibilities hitherto out of 
bounds of experimental research; (2) availability of supercomputers with 
vastly increased computing powers based on massive parallelism, along 
with highly efficacious new computing algorithms; and (3) development 
of new mathematical methods, partly motivated by research in quantum 
information theory, making possible important results in areas such as 
quantum information propagation and entanglement in many-body systems,
all based, however, on standard quantum mechanics.
All this taken together has illuminated a vast area relating to many-body
interactions in quantum mechanical systems, including the issue of
non-equilibrium evolution of closed systems, with new insights having been
made possible in foundations of statistical mechanics.


10.4.1 Equilibration in interacting many-body systems 


The basic idea in equilibration and thermalization relates to dephasing
and the dephased state (refer back to sec. 10.2.8). Considering a system
with a state space of dimension $D$, we denote its energy eigenvalues by
$E_n$ ($n = 1, 2, \cdots, D'$, say), where $D' \le D$ because of possible degeneracy.
We denote the projector onto the eigenspace of $E_n$ by $\hat P_n$. Given a state $\hat\rho$,
the dephased state $\hat\rho_D$ is given by

$$\hat\rho_D = \sum_{n=1}^{D'} \hat P_n\, \hat\rho\, \hat P_n. \tag{10-81a}$$

In the case of a non-degenerate spectrum, $\hat\rho_D$ corresponds to a diagonal
ensemble while more generally, in the presence of degeneracies, it represents
a density matrix in the block diagonal form (check this out). The
correspondence $\hat\rho \to \hat\rho_D$ is referred to as the ‘dephasing map’,

$$\mathcal{D}: \hat\rho \to \mathcal{D}\hat\rho = \hat\rho_D. \tag{10-81b}$$
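The action of the dephasing map can be illustrated with a short sketch. It assumes that the density matrix is written in a basis of energy eigenvectors, so that the sum over projectors amounts to deleting every matrix element that connects two different energy eigenvalues; the function name `dephase` and the three-level example are hypothetical.

```python
def dephase(rho, energies, tol=1e-12):
    """Apply rho -> sum_n P_n rho P_n in the energy eigenbasis.

    rho      : density matrix (list of rows) in a basis of energy eigenvectors
    energies : eigenvalue belonging to each basis vector (repeats = degeneracy)
    Elements connecting different eigenvalues are set to zero; blocks inside a
    degenerate eigenspace are left untouched.
    """
    d = len(energies)
    return [[rho[i][j] if abs(energies[i] - energies[j]) <= tol else 0.0
             for j in range(d)]
            for i in range(d)]

# Pure state: equal superposition of three basis vectors, where the last
# two share the same energy (a doubly degenerate level).
energies = [0.0, 1.0, 1.0]
rho = [[1.0 / 3.0] * 3 for _ in range(3)]
rho_D = dephase(rho, energies)
# rho_D = [[1/3, 0, 0], [0, 1/3, 1/3], [0, 1/3, 1/3]]  (block diagonal)
```

The coherence inside the degenerate level survives, which is the block diagonal structure mentioned above. Applying `dephase` a second time changes nothing, as expected of a map built from projectors.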


10.4.1.1 Equilibration on average 


Equilibration on average is the process wherein an isolated system,
initiated in a non-equilibrium state represented by the density matrix $\hat\rho_0$,
approaches the dephased state $\mathcal{D}\hat\rho_0$ at large times in the sense that, for
an observable $\hat A$, the long time average (evaluated over a sufficiently large
interval $\tau$) of $\langle A\rangle_t = {\rm Tr}(\hat A\hat\rho(t))$ differs little from $\langle A\rangle_D = {\rm Tr}(\hat A\,\mathcal{D}\hat\rho_0)$ (refer to
sec. 10.3.1 for analogous statements in the context of thermalization,
as implied by ETH). Moreover, the separation between $\hat\rho(t)$ and $\mathcal{D}\hat\rho_0$ also
becomes small. While we explain these statements in somewhat imprecise
and non-technical terms here so as to have easy access to the basic
principles, we refer to [56] for more precise explanations, with technical
details.


The idea of equilibration on average dates back to von Neumann’s work 
relating to the quantum ergodic theorem, and has been further developed 
during recent years. As we see below, the term ‘equilibration on average’ 
may refer to either a system as a whole, especially in regard to expectation 
values of observables, or to reduced states of subsystems. In both these 
respects equilibration is a feature of quite considerable generality charac- 


terizing the non-equilibrium dynamics of systems. 


In the foregoing statements, $\hat\rho(t)$ stands for the (non-equilibrium) density
matrix resulting from $\hat\rho_0$ under unitary evolution determined by the
Hamiltonian of the system. The observable $\hat A$ is supposed to belong to
a class of ‘relevant’ ones characterizing the system, including those
referred to as local observables, while the system itself is assumed to be
defined in terms of a local Hamiltonian, i.e., a sum of terms, each of
which depends on variables relating to subsystems of various sizes. The
separation between two states (represented by density matrices) can be
defined in terms of one among numerous operator norms, a commonly
used measure being the trace distance. Alternatively, the measure of
distinguishability between two states, based on a sufficiently large set of
POVMs (positive operator valued measures), is used.


Analogous to what was indicated in sec. 10.3.1 in the context of
thermalization, equilibration not only implies that $\langle A\rangle_t - \langle A\rangle_D$ remains small at
large times ($\tau$), but that the temporal variance $\overline{(\langle A\rangle_t - \langle A\rangle_D)^2}$ also remains
small.
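The two statements (closeness of the time average to the dephased value, and smallness of the temporal variance) can be explored on a toy model. The sketch below assumes a hypothetical three-level system with made-up non-degenerate levels, a pure initial state, a real symmetric observable, and units with $\hbar = 1$; it estimates the time average by sampling on a uniform grid.

```python
import cmath
import math

E = [0.0, 1.0, 2.4142135]                    # hypothetical energy levels
C = [math.sqrt(0.5), math.sqrt(0.3), math.sqrt(0.2)]   # initial amplitudes
A = [[1.0, 0.5, 0.2],
     [0.5, -1.0, 0.3],
     [0.2, 0.3, 2.0]]                        # observable in the energy basis

def expectation(t):
    """<A>_t for |psi(t)> = sum_n C_n exp(-i E_n t) |n>."""
    c = [C[n] * cmath.exp(-1j * E[n] * t) for n in range(3)]
    return sum((c[m].conjugate() * A[m][n] * c[n]).real
               for m in range(3) for n in range(3))

# Expectation value in the dephased (diagonal) state.
A_D = sum(C[n] ** 2 * A[n][n] for n in range(3))

# Time average and temporal variance over the interval [0, T_total].
T_total, steps = 400.0, 40000
samples = [expectation((k + 0.5) * T_total / steps) for k in range(steps)]
time_average = sum(samples) / steps
temporal_variance = sum((s - A_D) ** 2 for s in samples) / steps
```

In this toy example the time average lies close to `A_D`, while the temporal variance is not small: only three levels participate. This is consistent with the requirement, discussed in the text, that the initial state involve a large number of energy eigenstates for the fluctuations to be suppressed.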


When considered against the unitary nature of the time evolution of the
non-equilibrium state $\hat\rho(t)$, equilibration appears as a phenomenon of
surprising generality for interacting many-body systems. Of course, there
cannot be a permanent convergence between $\langle A\rangle_t$ and $\langle A\rangle_D$ at large times
since the unitary evolution is recurrent in nature. The conditions under
which equilibration occurs broadly belong to two categories: one relating
to requirements on the initial state and the other to the non-degeneracy
of energy gaps (refer, once again, to [56] for details).


The initial state is required to be a superposition or a mixture of a
sufficiently large number of energy eigenstates of the system under
consideration. Defining the occupation $p_n$ of the energy level $E_n$ as
$p_n = {\rm Tr}(\hat P_n \hat\rho_0)$ ($n = 1, 2, \cdots, D'$), one can refer to an ‘effective dimension’

$$d_{\rm eff} = \frac{1}{\sum_n p_n^2} \tag{10-82a}$$

of the initial state, i.e., the inverse participation ratio of the energy levels
in it, which is required to be large. One also needs the second largest
of the occupation numbers, which we denote by $p'$, to be small (see [56]),
a requirement that rules out superpositions of only a few eigenstates
that show coherent oscillations at large times instead of equilibration
dynamics. More precisely, the smaller of the two numbers,

$$g(\{p_1, p_2, \cdots, p_{D'}\}) = \min\left(\frac{1}{d_{\rm eff}},\; 3p'\right), \tag{10-82b}$$

is found to be of relevance. For a system with sufficiently large value of
$D$, ‘most’ pure states correspond to small values of $g$. This can be stated
more precisely in terms of the phenomenon of measure concentration:
pure states drawn uniformly from a subspace of large dimension have a
very high probability of being characterized by large values of $d_{\rm eff}$; in
consequence, expectation values in the pure states have a high probability
of being close to microcanonical values.
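A short sketch may help in seeing how the effective dimension and the second largest occupation work together. It assumes occupations $p_n$ that sum to one; the prefactor 3 multiplying the second largest occupation follows the form in which this quantity is commonly quoted in the literature and is an assumption here, to be checked against [56]. The function names and the example distributions are hypothetical.

```python
def effective_dimension(p):
    """d_eff = 1 / sum_n p_n^2 (inverse participation ratio)."""
    return 1.0 / sum(x * x for x in p)

def g(p):
    """Smaller of 1/d_eff and 3 p', with p' the second largest occupation.

    The factor 3 is an assumption taken from the commonly quoted form.
    """
    second_largest = sorted(p, reverse=True)[1]
    return min(1.0 / effective_dimension(p), 3.0 * second_largest)

uniform = [0.01] * 100            # 100 equally occupied levels
two_level = [0.5, 0.5]            # superposition of only two eigenstates
dominated = [0.9] + [0.001] * 100 # one heavily occupied level plus a broad tail

d_uniform = effective_dimension(uniform)    # about 100
g_uniform = g(uniform)                      # about 0.01: equilibration expected
g_two_level = g(two_level)                  # 0.5: coherent oscillations
g_dominated = g(dominated)                  # about 0.003, set by 3 p'
```

The third example shows why the second largest occupation matters: a state dominated by a single level has a small effective dimension, yet $g$ is small, since a single heavily occupied level only contributes a constant and does not produce oscillations by itself.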


The other pre-requisite for equilibration under unitary dynamics is the
non-degeneracy of energy gaps. This means that an energy gap
$E_n - E_m$ ($n \neq m$; $n, m = 1, 2, \cdots, D'$) is not repeated for any other pair $n', m'$.

In the case of a non-degenerate spectrum, this reduces to the non-resonance
condition mentioned in sec. 10.3 (and pertains to the situation when the
system equilibrates to a diagonal ensemble).

More generally, the condition of non-degenerate energy gaps can be
relaxed somewhat so as to admit degeneracies among the gaps, provided
the number of such degenerate gaps remains small compared to the total
number of gaps.
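The condition on the gaps is easy to test for a given list of distinct energy levels. The following is a minimal sketch, assuming a numerical tolerance `tol` for deciding when two gaps coincide; the function name and the two example spectra are hypothetical.

```python
from itertools import permutations

def degenerate_gaps(levels, tol=1e-9):
    """List neighbouring coincidences among the gaps E_n - E_m (n != m).

    All ordered pairs (n, m) are considered, the gaps are sorted, and each
    pair of adjacent gaps that agree within tol is reported as
    ((n, m), (n', m')). An empty list means the gaps are non-degenerate.
    """
    gaps = sorted((levels[n] - levels[m], (n, m))
                  for n, m in permutations(range(len(levels)), 2))
    return [(pair1, pair2)
            for (g1, pair1), (g2, pair2) in zip(gaps, gaps[1:])
            if abs(g1 - g2) <= tol]

equally_spaced = [0.0, 1.0, 2.0]        # oscillator-like spectrum
generic = [0.0, 1.0, 2.5, 4.7]          # no repeated gaps

clashes_equal = degenerate_gaps(equally_spaced)
clashes_generic = degenerate_gaps(generic)
```

For the equally spaced spectrum the gap of magnitude 1 occurs for two different pairs of levels (once with each sign), so `clashes_equal` is non-empty, while `clashes_generic` is empty. Comparing the number of clashes with the total number of gaps gives a rough handle on the relaxed form of the condition.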


While the two conditions mentioned above are of quite general nature,
i.e., equilibration is indeed a general feature of systems of large size that
appears in spite of the unitary dynamics characterizing a closed system,
there are only a few results relating to the time scale of equilibration;
moreover, the time scales obtained are too large (exponential in the system
size) to be of physical relevance.


It may be mentioned here that, under the above conditions, the reduced
state of a subsystem of the system under consideration also equilibrates
on the average to a state obtained by the reduction of the dephased state
$\hat\rho_D$ resulting from the initial state $\hat\rho_0$ under the action of the dephasing
map. Considering a sufficiently small subsystem, say, ‘A’, of the system
under consideration (‘S’), and denoting the complementary system by ‘B’,
one has

$${\rm Tr}_B\,\hat\rho_0 \;\xrightarrow{\text{(equilibration)}}\; \mathcal{D}({\rm Tr}_B\,\hat\rho_0) = {\rm Tr}_B\,\hat\rho_D, \tag{10-83}$$

where ${\rm Tr}_B$ denotes the partial trace with respect to ‘B’ and $\mathcal{D}$ is meant
to denote the dephasing map operating on density matrices pertaining
to the subsystem ‘A’. In other words, equilibration on the average of the
total system ‘S’ implies equilibration on the average of the subsystem
‘A’, where it is necessary to consider subsystems of size less than half
the size of ‘S’ since, in the case of a subsystem of a relatively large size,
fluctuations around the dephased state become dominant. We refer to
this phenomenon of dephasing of the reduced state of a subsystem as
subsystem equilibration (on the average).

Evidently, the equilibrated state $\hat\rho_D$ (described by a block diagonal or a
diagonal ensemble) depends on the initial state $\hat\rho_0$ or, in other words,
retains ‘memory’ of the latter. If the initial state involves eigenstates over
a wide range of energy eigenvalues, the equilibrated state is made up
of a similarly wide range of eigenstates. If the energy eigenvalues are
non-degenerate (or degeneracies, if present, are relatively rare) and if the
system conforms to the requirements of quantum chaos then the final
state resembles (on the average) a thermal one within any chosen narrow
energy window (one additionally requires an assumption of energy
distribution within the narrow window), but we will come to this later.
What is more, this ‘memory effect’ retained in the dephased state, to
which the system equilibrates on the average, appears to work differently
for integrable and non-integrable systems. This is further discussed in
sec. 10.4.1.4 below.


10.4.1.2 Equilibration during intervals 


The term equilibration during intervals describes a process in which some 
property of an interacting system remains close to an equilibrium value 
at all times during a certain time interval, departing from the latter at 
earlier and later times. While equilibration on average is provably a very 
generic feature for large systems, equilibration during intervals has been 
rigorously established for only a restricted class of systems and states, 
though it too appears to occur widely in non-equilibrium dynamics of 


quantum systems. What is more, the feature of equilibration on average
is commonly related to that of thermalization, but no such connection is
easily discernible in the case of equilibration during intervals. Indeed, in
the few cases in which equilibration during intervals of reduced states of 
subsystems has been established, the latter are found not to lie close to 
thermal states of suitably chosen Hamiltonians. On the other hand, time 
scales of significant relevance can be established (scaling reasonably with 
the system size) in the case of equilibration during intervals, in contrast 


to equilibration on average. 


Equilibration during intervals has been rigorously established in the case
of initial states, referred to as the Gaussian ones, for a class of quadratic
bosonic Hamiltonians, while non-Gaussian initial states have also been
considered in certain cases. In such cases, it has been proved (refer
to [56]) that the reduced state of an arbitrarily chosen subsystem remains
close (in some precisely defined sense) to the reduced state obtained from
a Gaussian state of the whole system during an interval between times
$\tau_1, \tau_2$ (the relaxation time and the recurrence time; see [149]), provided the
system size is larger than a certain minimum.

The time $\tau_1$ depends on the size of the subsystem in question, though
it does not depend on the size of the whole system. It is related to the
speed at which correlations are transported through the system and the
length scale of decay of correlations in the initial state. The time $\tau_2$ is a
lower bound on the recurrence time and is related to the time it takes for
a signal to travel through the whole system.
10.4.1.3 Lieb-Robinson bounds


The Lieb-Robinson bounds provide an important tool in the estimation of 
time scales in the non-equilibrium dynamics of interacting many-body 
systems described by local Hamiltonians. Imagine a non-equilibrium 
state of an interacting lattice system made up of units located at the 
lattice sites where the interaction is local in nature. The time-evolution of 
the system is determined by the way the state of a localized set of units 
sends out ‘information’ to neighboring regions of the lattice by means of 
interaction that causes the states of these neighboring units to change. 
In this way the state of the entire system gets changed in a process analo- 
gous to spin waves on a lattice with nearest neighbor interactions among 
the spins. The bounds set a limit to the spread of information to within a 
‘cone’ (at times referred to as a ‘light cone’), outside of which the correla- 
tion between the units is suppressed exponentially. The ‘units’ in ques- 
tion may be bosons or fermions as the case may be, and are commonly 


referred to as ‘spins’. 


Given two operators $\hat A$, $\hat B$ with localized support (which implies that these
depend on the variables of two localized sets of spins, with extensions,
say, $|A|$, $|B|$), the bounds are expressed mathematically in the following
form:

$$\| [\hat A(t), \hat B] \| \le c\, \|\hat A\|\, \|\hat B\|\, \min(|A|, |B|)\, \exp\big[-\mu\big(d(A, B) - v|t|\big)\big]. \tag{10-84}$$

In this expression, where no relativistic effects are considered, $\|\cdot\|$
stands for the operator norm of an observable, and $d(A, B)$ for the separation
between the supports of $\hat A$, $\hat B$, while $v\,(>0)$ plays the role of a group
velocity and $c, \mu$ are positive constants. One observes that the
time-dependent correlation between $\hat A(t)$ and $\hat B$ decays exponentially outside the
cone $d(A, B) = v|t|$, while inside the cone, the correlation can grow
significantly.
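The qualitative content of the bound can be seen by evaluating its right hand side for a few times. The sketch below assumes made-up values for the constants $c$, $\mu$, $v$, which in practice depend on the model and the lattice; it also caps the result at $2\|\hat A\|\|\hat B\|$, the value that the norm of any commutator cannot exceed, since the exponential bound carries no information once it grows beyond that. The function name is hypothetical.

```python
import math

def lieb_robinson_bound(norm_a, norm_b, size_a, size_b, distance, t, c, mu, v):
    """Right hand side of the bound, capped at the trivial value 2|A||B|.

    size_a, size_b : extensions of the two supports
    distance       : separation d(A, B) between the supports
    c, mu, v       : positive, model-dependent constants (assumed given)
    """
    lr = (c * norm_a * norm_b * min(size_a, size_b)
          * math.exp(-mu * (distance - v * abs(t))))
    return min(lr, 2.0 * norm_a * norm_b)

# Unit-norm single-site operators ten sites apart; c = mu = 1, v = 2.
params = dict(norm_a=1.0, norm_b=1.0, size_a=1, size_b=1,
              distance=10.0, c=1.0, mu=1.0, v=2.0)
outside_cone = lieb_robinson_bound(t=1.0, **params)   # exp(-8), tiny
on_cone = lieb_robinson_bound(t=5.0, **params)        # 1.0
inside_cone = lieb_robinson_bound(t=10.0, **params)   # capped at 2.0
```

Outside the cone ($v|t| < d$) the bound is exponentially small, so the two operators effectively commute; inside the cone the bound places no useful restriction.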


The idea that local dynamics spreads in the form of an excitation through
the lattice is clearly relevant in the context of non-equilibrium time- 
evolution of the system. Such ‘light cone dynamics’ has been observed in 
optical lattice systems and in continuous systems involving cold atoms. 
The propagation of excitation generates entanglement in the system that 
cannot be simulated in classical systems. Modified bounds analogous to 
the one mentioned above are still possible for systems with interactions 


that deviate from strictly local ones. 


However, as mentioned above, few, if any, significant results are avail- 
able on the time scales of equilibration on average, in spite of important 
insights having been provided by the Lieb-Robinson bounds regarding 


non-equilibrium dynamics in locally interacting systems. 


10.4.1.4 The maximum entropy principle and the generalized Gibbs ensemble


We have already seen that in the case of equilibration on average, the
equilibrium expectation value of an observable or the equilibrium reduced
state of a subsystem approximates the expectation value in the dephased
state ($\hat\rho_D = \mathcal{D}\hat\rho_0$, refer back to sec. 10.4.1.1) or the reduced state obtained
from the dephased state, as the case may be. In other words, the state $\hat\rho_D$
encodes the equilibrium properties of the system. It turns out that this
state is one that satisfies a maximum entropy principle, subject to given
values of all the relevant constants of motion.


What is more, it happens that in a large number of situations (includ- 
ing ones resulting from quenches in complex interacting systems), the 
equilibrium expectation values of large classes of observables are well ap- 
proximated by those in states that maximise the entropy subject to given 
values of much smaller sets (see below) of constants of motion. Such a 


dephased state is referred to as a generalized Gibbs ensemble (GGE). 


The principle of maximum entropy can be stated as follows (see [56]):
supposing that the expectation value of an operator $\hat A$ equilibrates on
average, the equilibrium value of $\langle A\rangle_t = {\rm Tr}(\hat A\hat\rho(t))$ (where $\hat\rho(t)$ stands for
the evolving non-equilibrium density matrix) is given by ${\rm Tr}(\hat A\hat\omega)$, where $\hat\omega$
is the unique quantum state that maximises the von Neumann entropy,
given the values of all the constants of motion as determined by the initial
state $\hat\rho_0$.

Here $\hat\omega$, introduced for notational simplicity, is nothing but the dephased
state $\hat\rho_D$, i.e., the time-averaged state obtained from $\hat\rho(t)$.


This result is based on the observations that the von Neumann entropy is
non-decreasing under the dephasing map and, moreover, two states $\hat\rho_1, \hat\rho_2$
give the same values for all conserved quantities if and only if they lead to
the same dephased state (refer to [56] for uniqueness of $\hat\omega$). It is noteworthy
that the maximum entropy principle follows from unitary dynamics,
along with dephasing. However, the predictive value of the maximum
entropy principle, stated in the above form, is rather limited since the
number of all linearly independent conserved quantities (in the space of
observables) increases exponentially with the number of constituents.


Indeed, all the projection operators $\hat P_n$ ($n = 1, 2, \cdots, D'$) on to the
energy eigenspaces are constants of motion.


The question then arises as to whether it is possible to find a small number
of constants of motion in terms of which the maximum entropy state
$\hat\omega$ can be expressed. What is relevant in this context is the set of
independent local constants of motion, in terms of which one can distinguish
between integrable and non-integrable systems since the two differ in
respect of the maximum entropy states that result from the dephasing.


In the case of integrable systems, the number of such local conserved
quantities is generally of the order of the number of constituents ($N$), i.e.,
far less than the dimension of the Hilbert space ($D$) or the number of
independent projection operators $\hat P_n$. Let $\hat C_k$ ($k = 1, 2, \cdots, M$, say) be such
a set of conserved operators (an appropriately chosen minimal one); then
one can express $\hat\omega$ in the form of a generalized Gibbs ensemble (GGE) as

$$\hat\omega = \frac{e^{-\sum_{k=1}^{M} \beta_k \hat C_k}}{{\rm Tr}\big(e^{-\sum_{k=1}^{M} \beta_k \hat C_k}\big)}. \tag{10-85}$$

In the case of an initial pure state $|\psi_0\rangle = \sum_{n=1}^{D} C_n |n\rangle$, the maximum entropy
state can be written in the form $\hat\omega \propto \exp\big[-\sum_{n=1}^{D} \beta_n \hat P_n\big]$, with $\beta_n = -\ln|C_n|^2$.
However, this does not represent a GGE, since the projectors do not
constitute a minimal set of local observables.
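The statement about the pure initial state can be verified with a few lines of arithmetic. The sketch below assumes a non-degenerate three-level system with hypothetical amplitudes $C_n$, all non-zero so that the logarithms exist; since the projectors are mutually orthogonal, the exponential of $-\sum_n \beta_n \hat P_n$ is diagonal with weights $e^{-\beta_n}$.

```python
import math

# Hypothetical amplitudes of the initial pure state (normalized to one).
C = [math.sqrt(0.5), 1j * math.sqrt(0.3), -math.sqrt(0.2)]

# Multipliers beta_n = -ln |C_n|^2 attached to the projectors P_n.
beta_n = [-math.log(abs(c) ** 2) for c in C]

# exp(-sum_n beta_n P_n) is diagonal in the energy basis with entries
# exp(-beta_n); dividing by the trace gives the normalized state.
weights = [math.exp(-b) for b in beta_n]
trace = sum(weights)
omega_diagonal = [w / trace for w in weights]
# omega_diagonal reproduces the occupations |C_n|^2 = 0.5, 0.3, 0.2
```

The maximum entropy state thus coincides with the diagonal ensemble of the initial state, with one multiplier per energy level. This makes explicit why the construction has little predictive value: it needs as many parameters as there are levels.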


In general, the GGE obtained from an initial state $\hat\rho_0$ depends on the
latter (i.e., retains a ‘memory’ of the initial state) through the multipliers
$\beta_k$ ($k = 1, 2, \cdots, M$), which implies that equilibration on average is not
necessarily the same as thermalization, this, generally speaking, being
the situation for an integrable system (however, the reduced state of a
subsystem may well be a thermal one).


10.4.2 Thermalization: summary and overview 
10.4.2.1 Opening remarks 


Thermalization is the process in which an isolated system (‘S’), initiated
in a state $\hat\rho_0$, effectively approaches a dephased state $\hat\rho_D$ representing a
microcanonical ensemble under a certain appropriate set of conditions.
Alternatively, it may be described as a process in which the reduced state
of a subsystem (‘A’) approaches a Gibbs ensemble. Here the term Gibbs
ensemble refers to a canonical one, assuming that there is no exchange
of particles between the subsystem ‘A’ and the complementary subsystem
(‘B’) of ‘S’.


The system ‘S’ is said to effectively approach the thermal, i.e., the
microcanonical state in the sense that expectation values of observables
(belonging to a certain class of ‘relevant’ ones) approach their
microcanonical averages, as outlined in sec. 10.3.1.

In other words, thermalization may refer to either the system ‘S’ under
consideration or subsystems (such as ‘A’ above); the term ‘thermal state’
will mean one of the equilibrium ensembles of quantum statistical mechanics
(the microcanonical or the canonical ensemble as the case may
be; in the present chapter we will not have occasion to refer to the other
equilibrium ensembles mentioned in chapter 2).


Thermalization of the whole system (‘S’) is fruitfully discussed within the
framework provided by the eigenstate thermalization hypothesis, as outlined
in secs. 10.3.1 and 10.3.4. On the other hand, subsystem thermalization
which, along with subsystem equilibration, has been mentioned
several times in the above paragraphs, is based on a somewhat distinct
set of considerations.


While equilibration is a process of broad generality in many-body inter- 
acting systems, thermalization can be looked upon as a special case, 
though of almost comparable generality, conditional upon requirements 
on the initial state and the nature of the system (i.e., of its energy eigen- 


values and eigenstates). 


In the case of thermalization of an isolated system, the requirement on 
the system Hamiltonian is that of ‘quantum chaos’, though this term 
entails a degree of vagueness. One common characterization of quan- 
tum chaos is in terms of the statistics of the energy spectrum which is 
generally in the nature of the Wigner-Dyson distribution, with its attendant
features of level repulsion and conformity with the non-degeneracy
and non-resonance conditions. The requirement of non-degeneracy of
eigenstates (degeneracies of relatively rare occurrence are admissible)
contrasts with the case of equilibration where degenerate energy levels 
(arising due to the existence of a class of conserved operators) are not ex- 
cluded (this leads to dephased density matrices having a block diagonal 
form). Similarly, the non-resonance condition is imposed upon energy
gaps relating to eigenstates rather than to gaps between energy levels
(possibly degenerate). In the literature, the terms ‘quantum chaos’ and
‘quantum ergodicity’ are frequently used interchangeably. Once again,
the distinction is not precise, there being no adequately sharp
characterization of either. One hopes for a sharp differentiation to emerge,
analogous to the distinction between classical ergodicity and mixing type
dynamics, involving a sufficiently rapid decay of correlations.
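The contrast between Wigner-Dyson statistics (with level repulsion) and uncorrelated levels can be illustrated numerically. The sketch below is not taken from the text: it uses the ratio of consecutive level spacings, a commonly used diagnostic that avoids unfolding the spectrum, and it assumes a real symmetric Gaussian random matrix as a stand-in for a ‘quantum chaotic’ Hamiltonian. The matrix size and the helper name `mean_gap_ratio` are illustrative.

```python
import numpy as np

def mean_gap_ratio(levels):
    """Mean of min(s_n, s_{n+1}) / max(s_n, s_{n+1}) over consecutive spacings s_n."""
    s = np.diff(np.sort(levels))
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return r.mean()

rng = np.random.default_rng(5)
D = 1000

# Stand-in for a chaotic Hamiltonian: real symmetric Gaussian random matrix
M = rng.normal(size=(D, D))
goe_levels = np.linalg.eigvalsh((M + M.T) / 2)

# Stand-in for an integrable spectrum: independent, uncorrelated levels
poisson_levels = rng.uniform(size=D)

r_goe = mean_gap_ratio(goe_levels)
r_poisson = mean_gap_ratio(poisson_levels)
print(f"random-matrix spectrum: <r> = {r_goe:.3f}")
print(f"uncorrelated spectrum:  <r> = {r_poisson:.3f}")
```

For large matrices the random-matrix value is expected to lie near 0.53, while uncorrelated levels give a value near $2\ln 2 - 1 \approx 0.39$; the larger value reflects the suppression of small gaps, i.e., level repulsion.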


Likewise, the condition on the initial state is more stringent in the case 
of thermalization, as compared with equilibration. Focusing once again 
on the system as a whole (subsystem thermalization will be looked at in 
sec. 10.4.2.2 below), thermalization requires a narrow energy distribution 
of the initial state, which is to be confined to a sufficiently thin ‘energy 
shell’, and a sufficiently widely distributed participation of the energy 
levels within the shell. In contrast, equilibration is not conditional upon 
the energy distribution of the initial state being confined to a narrow 
energy window, though here also an adequately wide participation of the 


energy levels (possibly degenerate) is required. 


In the case of thermalization, one requires a broad participation of the
energy eigenstates, all within an energy shell, the energy levels being mostly
non-degenerate. On the other hand, equilibration also requires a broad
participation of the energy levels, possibly degenerate, but ones not necessarily
confined to an energy shell.


As mentioned in sec. 10.4.1, the equilibrated state of the system as a 
whole, described by a GGE satisfying the maximum entropy principle, 
retains the memory of the initial state in the form of the initial values of 
the constants of motion. In the case of a system characterized by quan- 
tum chaos, if the energy distribution of the initial state is not confined to 
a narrow energy window, then the equilibrated state can be looked upon 
as a mixture of microcanonical ensembles distributed over numerous en- 
ergy shells, provided that the condition relating to the participation of the 
energy levels within each individual energy window is conformed to. Even 
so, the system retains some ‘memory’ of the initial state in the form of the 
energy distribution across the shells, though the distribution within each 


shell gets evened out during the process of equilibration. 


It is worthwhile to recall the difference between classical and quantum me- 
chanical systems in respect of the ‘energy’ of a state, such as the initial state 
in a non-equilibrium process. Classically, the system must be character- 
ized by some value of the energy, whatever be its past history. The fact that 
this value may not be precisely defined is nevertheless consistent with the 
state being defined within a narrow energy shell. In the quantum mechan- 
ical description, on the other hand, it is only the mean energy that can be 
precisely defined, while the energy distribution among eigenstates may cover
numerous energy shells, depending on how the system was fed with energy
during its past evolution. For all practical purposes, however, the energy
distribution indeed remains confined to an energy shell around the mean
energy, provided the system is sufficiently large in size.


With this background on thermalization in the larger context of equilibration,
we now focus on a few aspects of subsystem thermalization.


10.4.2.2 Subsystem thermalization 


We consider an isolated composite system (‘S’) made up of subsystems ‘A’ 
and ‘B’ where the former, the system of interest, is of a relatively small 
size compared to the latter, the complementary system in ‘S’ and referred 
to as the ‘bath’. However, in contrast to the standard approach where 
the bath is initiated in a thermal state and is assumed to be of very large 
size, having a quasi-continuous spectrum (refer back to sections 3.4.4 
and 8.7.2.2), similarly restrictive assumptions on the bath are not im- 


posed in the present approach. 


This has the advantage of avoiding circularity in the attempt at explaining 
thermalization: in the commonly adopted approach, the explanation of the 
canonical ensemble is based on the assumption that the composite system 
is in a thermal state described by a microcanonical ensemble, which leaves 
unexplained how, in the first place, the microcanonical ensemble comes 


about in the case of a closed system. 


The following features are assumed to be characteristic of subsystem
thermalization (see [56] for details): (i) equilibration - in particular, we
will consider subsystem thermalization in the context of subsystem
equilibration on average and equilibration on average of the composite
system; (ii) subsystem initial state independence - the equilibrated state has
to be independent of the initial reduced state of the subsystem regardless 
of whether there exist local conserved quantities pertaining to the sub- 
system; (iii) bath state independence — one expects that the equilibrated 
thermal state of a sufficiently small subsystem should be independent 
of the details of the initial state of the rest of the system, and should 
depend instead on a small number of macroscopic parameters charac- 
terizing it, such as the energy density (this may, of course, depend on 
the initial state of the bath); (iv) diagonal form of the equilibrated state — 
there should exist some basis in the state space of the subsystem (and 
some associated Hamiltonian) in which the equilibrated state is diagonal; 
more precisely, the time-dependent basis in which the instantaneous re- 
duced state is diagonal approaches a stationary limit at sufficiently large 
times; (v) Gibbs state — the equilibrated state should appear as a Gibbs 
state corresponding to some appropriately defined temperature and the 
asymptotic stationary basis mentioned above, along with an asymptotic 


‘self-Hamiltonian’. 


It is the last feature in the above list that can be said to justify the
canonical ensemble of equilibrium statistical mechanics without the necessity
of a prior justification of the microcanonical ensemble, and to place the
former on an equal footing with the latter.


The above list of requirements is to be looked at as a set of sufficient
conditions for the term ‘thermalization’ to be applicable. To proceed further,
we formulate an explicit (and sufficiently general) definition of
thermalization (including subsystem thermalization along with thermalization of
an isolated system as a whole) and then proceed to state (without proof!)
a relevant result in respect of subsystem thermalization, indicating the
initial conditions that make subsystem thermalization possible.


The system under consideration, defined in terms of the Hamiltonian $H$,
will be said to thermalize (on average) for a given set of initial states $\mathcal{S}$ if,
for each $\hat{\rho}_0$ in $\mathcal{S}$, it effectively equilibrates on average to the state $\omega = \mathcal{D}\hat{\rho}_0$
(i.e., expectation values of observables in the evolving non-equilibrium
state approach the corresponding expectation values in $\omega$) that is close to
a thermal state of the form

$$\hat{\sigma}[\mathcal{H}, \beta] = \frac{e^{-\beta \mathcal{H}}}{\mathrm{Tr}\left(e^{-\beta \mathcal{H}}\right)}, \qquad \text{(10-86)}$$

where $\mathcal{H}$ is some appropriate Hamiltonian and $\beta$ is some positive real
number determined by $\mathrm{Tr}(\mathcal{H}\hat{\rho}_0)$. Here the measure of closeness between
$\hat{\sigma}$ and $\omega$ is defined in terms of the distinguishability under a sufficiently
large set of POVMs [56] (alternatively, one can use the trace distance
between operators). If one considers the set of all POVMs with the
measurement operators defined on a subsystem (‘A’), then this provides us
with the definition of subsystem thermalization, in which case we have
$\mathcal{H} = H_A$, the reduced Hamiltonian obtained by taking partial trace over
the bath ‘B’. If, on the other hand, the POVMs are supported on the
composite system (‘S’) then a natural choice is $\mathcal{H} = H$ (one can also have
$\beta = 0$, while $\mathcal{H}$ is the projection operator on a narrow energy shell).
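The ingredients of this definition (a thermal state of the form (10-86), an inverse temperature fixed by the mean energy, and the trace distance as a measure of closeness) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names, the bisection bounds, the four-level Hamiltonian, and the initial populations are all hypothetical choices, and the bisection assumes the target energy lies below the infinite-temperature mean energy so that a positive inverse temperature exists.

```python
import numpy as np

def gibbs_state(H, beta):
    """Thermal state exp(-beta H) / Tr exp(-beta H) for a Hermitian matrix H."""
    evals, evecs = np.linalg.eigh(H)
    w = np.exp(-beta * (evals - evals.min()))  # shift spectrum for numerical stability
    w /= w.sum()
    return (evecs * w) @ evecs.conj().T        # sum_n w_n |n><n|

def trace_distance(rho, sigma):
    """Half the sum of the absolute eigenvalues of rho - sigma."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

def beta_from_energy(H, energy, lo=0.0, hi=50.0, tol=1e-10):
    """Bisection for beta in [lo, hi] with Tr(H * gibbs_state(H, beta)) = energy."""
    def mean_energy(b):
        return np.real(np.trace(H @ gibbs_state(H, b)))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The thermal mean energy decreases monotonically with beta
        if mean_energy(mid) > energy:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: four non-degenerate levels, given initial populations
H = np.diag([0.0, 1.0, 2.0, 3.0])
populations = np.array([0.5, 0.3, 0.15, 0.05])
omega = np.diag(populations)                   # dephased state of the initial state
beta = beta_from_energy(H, np.trace(H @ omega).real)
distance = trace_distance(omega, gibbs_state(H, beta))
print(f"beta = {beta:.4f}, trace distance to thermal state = {distance:.4f}")
```

In this toy example the distance need not be small, since a four-level system with arbitrarily chosen populations has no reason to thermalize; the point is only to show how the quantities entering the definition are evaluated.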
Subsystem thermalization is a useful concept in the context of
thermalization of quantum mechanical systems (as compared with thermalization
of an isolated composite system) in the sense that, with an appropriate
restriction on the set of initial states (but one not unrealistic from
a practical point of view) the possibility of thermalization is de-linked
from the issue of integrability. Since ETH requires quantum chaos as a
pre-requisite while, on the other hand, thermalization is found to occur
for integrable systems as well, the idea of subsystem thermalization is a
fruitful one.


It is also to be mentioned that subsystem thermalization can be estab- 
lished by following the route of dynamical equilibration as in sec. 10.4.1 
or the one based on the idea of typicality (refer to sec. 10.4.3 below). Rig- 
orous proofs along either of the two routes are available in the literature 


under suitable restrictions on the class of admissible states. 


Incidentally, an acceptable proof establishing subsystem thermalization
in a system (‘A’) weakly interacting with a bath (‘B’; the assumption of a
weak interaction, which has been left unstated up to this point, is necessary
to formulate a proof and to ensure that $\mathcal{H}$ introduced above is close
to the reduced Hamiltonian $H_A$) has to address properly the possibility
that even a weak interaction may significantly alter the energy eigenstates
of the non-interacting Hamiltonian and the projection operators into the
eigenspaces, the alteration in the eigenvalues being relatively small and
devoid of substantive relevance. The alteration of the eigenstates becomes
especially relevant in the case of a system exhibiting quantum chaos.

With this background in place, we paraphrase, avoiding technicalities,
one of several possible formulations of the conditions leading to
subsystem thermalization (see [56] for details).


We consider a composite system (‘S’) made up of a subsystem (‘A’, the
system of interest) and a sufficiently large bath (the complementary
system ‘B’). Let the bath be described by a locally interacting Hamiltonian
(in general, characterized by quantum chaos) such that in a given
energy interval ranging from $E$ to $E + \Delta$ ($\Delta \ll E$), the logarithm of the
number of bath states (defined independently of the interaction with ‘A’)
varies approximately linearly with $E$ with a slope $\beta$. Let, furthermore, the
interaction between ‘A’ and ‘B’, represented by the Hamiltonian $H_I$, be
sufficiently weak in the sense that $\| H_I \| \ll \beta^{-1} \ll \Delta$ (where $\| \cdot \|$ denotes
the operator norm), and the energy interval mentioned above be far from
the edges of the spectrum of the bath. Then the time evolution of the
system results in the subsystem ‘A’ reaching a thermal state $\hat{\sigma}[H_A, \beta]$ (refer
to (10-86); here $H_A$ is the unperturbed Hamiltonian of ‘A’) in the sense
of subsystem thermalization outlined above, for any initial state $\hat{\rho}_0$ of a
rectangular nature with respect to the energy interval from $E$ to $E + \Delta$. In
other words, the large time average of the trace distance between $\hat{\rho}(t)|_A$
and $\hat{\sigma}[H_A, \beta]$ attains a sufficiently small value, as specified in [56]. Here
$\hat{\rho}(t)|_A$ stands for the reduced state (pertaining to ‘A’) of the evolving state
$\hat{\rho}(t)$ of ‘S’.
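The role of the slope $\beta$ of the logarithm of the number of bath states can be made concrete in a toy model. The sketch below assumes, purely for illustration, a bath of $n$ independent two-level units of level spacing `eps` (a non-interacting bath, unlike the locally interacting one in the statement above), for which the number of states with $k$ excited units is a binomial coefficient. The values of `n`, `eps`, and `k` are hypothetical.

```python
from math import comb, log

n = 200      # hypothetical number of two-level units in the bath
eps = 1.0    # hypothetical level spacing of each unit

def log_states(k):
    """Logarithm of the number of bath states with energy k * eps."""
    return log(comb(n, k))

k = 40  # bath energy E = k * eps, well away from the edges of the spectrum

# Slope of ln(number of states) with respect to energy, by central difference
beta_fd = (log_states(k + 1) - log_states(k - 1)) / (2 * eps)

# Large-n estimate of the same slope: ln((n - k) / k) / eps
beta_est = log((n - k) / k) / eps

print(f"finite-difference slope: {beta_fd:.4f}")
print(f"large-n estimate:        {beta_est:.4f}")
```

With such a $\beta$ in hand, the weak-coupling requirement $\| H_I \| \ll \beta^{-1} \ll \Delta$ can be read as a statement about three energy scales: the interaction strength, the thermal energy, and the width of the energy window.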


In this context, the term rectangular state needs explanation. A rectangular
state with reference to an energy interval from, say, $E$ to $E + \Delta$
and to a Hamiltonian $H$ defined on a Hilbert space is one that leads to
the microcanonical ensemble defined over the same energy interval under
the action of the dephasing map $\mathcal{D}$ (refer to (10-81b)). The requirement
of a rectangular initial state is somewhat restrictive but guarantees that 
subsystem thermalization does occur. This requirement can be slightly 
weakened provided it does not appreciably affect the expectation values 
of local observables. If, on the other hand, the deviation from a rectangu- 
lar state does affect the expectation value of a local observable then the 
dephased subsystem state also differs substantially from a thermal one. 
It appears that a relatively less restrictive initial state leads to thermaliza- 
tion only under additional conditions on the eigenstates, along the lines 
of the ETH. Thus, the ETH and the above conditions implying subsystem 
thermalization constitute, in a sense, competing frameworks explaining 


thermalization in quantum mechanical systems. 
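The definition of a rectangular state given above can be illustrated with a small numerical sketch. It assumes a non-degenerate spectrum, so that the dephasing map simply removes the off-diagonal elements in the energy eigenbasis; the level positions, the energy window, and the random phases are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 40

# Hypothetical non-degenerate energy levels (working in the energy eigenbasis)
E = np.linspace(0.0, 10.0, D) + 0.01 * rng.uniform(size=D)

# Energy window [E0, E0 + Delta]
E0, Delta = 4.0, 2.0
in_shell = (E >= E0) & (E <= E0 + Delta)
d = int(in_shell.sum())

# Rectangular pure state: equal weights inside the shell, arbitrary phases
C = np.zeros(D, dtype=complex)
C[in_shell] = np.exp(2j * np.pi * rng.uniform(size=d)) / np.sqrt(d)
rho0 = np.outer(C, C.conj())

# Dephasing map for a non-degenerate spectrum: keep only the diagonal part
omega = np.diag(np.diag(rho0))

# Microcanonical ensemble on the same window
micro = np.diag(in_shell / d)

print(np.allclose(omega, micro))
```

A state whose populations inside the window are not flat would dephase to a diagonal state different from `micro`, which is the kind of deviation the paragraph above refers to.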


The issue of subsystem thermalization will be addressed once again in 


the framework of dynamical typicality in sec. 10.4.3 below. 


10.4.2.3 ETH and thermalization: postscript 


The role of the ETH in thermalization has been explained in sec. 10.3. 
With the concept of thermalization as defined in sec. 10.4.2.2 above (refer 
back to the paragraph containing (10-86)), issues relating to subsystem 
thermalization can be addressed within the same framework as those on 
thermalization of a closed system as a whole. As already mentioned, the 
two approaches to thermalization within this common framework differ 
in respect of the specification of the class of initial states and also of
the system as a whole (the energy eigenvalues and eigenstates, and their
degeneracy). Additionally, there are a number of technical issues relating
to what exactly one means by thermalization.


The fact that the ETH is a hypothesis, with no rigorous proof establishing 
it, has led to attempts at adding or taking away a few technical stipu- 
lations and then showing that the resulting definition applies to a suffi- 
ciently large class of systems and, finally, proving that ETH does lead to 
thermalization where, in the last-named endeavor, one focuses on ther- 
malization of an isolated system as a whole (to a microcanonical ensem- 
ble) or of a subsystem (to a canonical ensemble). There have been at- 
tempts at establishing sufficiency, or necessity, or both, of ETH as far 
as thermalization is concerned. A definitive and complete picture, how- 
ever, is yet to emerge. There is overwhelming evidence, though, that the 
ETH (in some form or other) is a very useful idea in the context of ther- 
malization (refer to [56], [143], [27], [26]; a number of statements and 
results from these sources have been mentioned in several sections of 


the present chapter). 


A notable exception to thermalization occurs in the case of systems
exhibiting many-body localization ([30], [116]). A stream of independent
particles flowing into a disordered medium (owing to the presence of
impurities or to other structural reasons) fails to diffuse through it because of
single-particle localization. Analogously, an interacting many-body
system exhibits localization features in the presence of disorder, where
effects of quantum chaos are suppressed and thermalization fails to occur.
The energy eigenstates remain localized when considered in an arbitrarily
chosen generic basis (refer back to sec. 10.2.6) and subsystem density
matrices remain localized in the energy space. The localization of
eigenstates leads to a clear violation of ETH, resulting in an absence of
thermalization, connected with the existence of local constants of motion. A
number of such systems exhibit a phase transition between a disordered
insulator and a conducting phase. More generally, transport coefficients
drop to zero due to the localization effect.


10.4.3 Dynamical typicality 


Imagine a situation where one draws a state $|\psi\rangle$ from a collection of pure
states of a system satisfying some specified condition (say, that of unit
norm), in accordance with some specified probability distribution over 
this collection. Assume further that, for such a randomly selected pure 
state, some specified feature or property is found to hold with a very high 
probability — for instance, the expectation value of some observable A 
has some specified value. One then refers to the said property as being 


typical under the specified probability distribution. 


The idea of dynamical typicality is a related one — here the property in 
question involves the time evolution of the state selected. For instance, 
it may turn out that the time dependent expectation value of A tends, in 


the long run, to a certain limiting value. 
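The kind of long-time behavior referred to here can be seen in a small simulation. The sketch below is an illustration under explicit assumptions, none of which come from the text: the Hamiltonian is a real symmetric Gaussian random matrix, the observable takes the values $+1$ and $-1$ on two halves of a fixed basis, and the initial states are drawn at random subject to the condition that the initial expectation value equals $+1$.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 200

# Hypothetical 'chaotic' Hamiltonian: real symmetric Gaussian random matrix
M = rng.normal(size=(D, D))
H = (M + M.T) / np.sqrt(2 * D)
E, V = np.linalg.eigh(H)

# Observable A: +1 on the first half of the basis states, -1 on the second half
a_diag = np.where(np.arange(D) < D // 2, 1.0, -1.0)
A_eig = V.T @ np.diag(a_diag) @ V            # A written in the energy eigenbasis

def late_time_values(psi0, times):
    """Expectation values of A at the given times, and the dephased value."""
    c = V.T @ psi0                            # amplitudes in the energy eigenbasis
    dephased = np.sum(np.abs(c) ** 2 * np.diag(A_eig))
    vals = []
    for t in times:
        psi_t = c * np.exp(-1j * E * t)
        vals.append(np.real(psi_t.conj() @ A_eig @ psi_t))
    return np.array(vals), dephased

times = np.linspace(200.0, 400.0, 50)
results = []
for _ in range(3):
    # Random normalized state supported on the first half: <A> = 1 at t = 0
    psi0 = np.zeros(D)
    psi0[: D // 2] = rng.normal(size=D // 2)
    psi0 /= np.linalg.norm(psi0)
    vals, dephased = late_time_values(psi0, times)
    results.append((vals.mean(), dephased))
    print(f"late-time mean {vals.mean():+.3f}, dephased value {dephased:+.3f}")
```

All the randomly drawn states start with the same expectation value and end up fluctuating about nearly the same small value at late times, which is the sense in which the long-time behavior is typical for the chosen distribution of initial states.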


Statements and results based on dynamical typicality differ somewhat
from those based on the quantum dynamics of density matrices. The
latter, to be referred to as ‘quantum dynamical’ ones, have been made
use of, in a large measure, in earlier sections on ETH, thermalization,
and equilibration. There is a long tradition in the literature of attempts
at deriving corresponding statements by making use of typicality-based
arguments.


An early instance of a typicality-based argument justifying the
microcanonical ensemble, one of major relevance in the literature, is met with in
von Neumann’s derivation of the quantum ergodic theorem. Referring to
von Neumann’s work, Goldstein et al. [58] describe as ‘normal typicality’
the following behavior: for a ‘typical’ finite family of commuting
macroscopic observables, every initial wave function $\psi_0$ from a microcanonical
energy shell so evolves that for most times $t$ in the long run, the joint
probability distribution of these observables obtained from $\psi_t$ is close to
their microcanonical distribution (refer back to sec. 10.3.2). In other
words, most decompositions of the Hilbert space into subspaces (defined
by common eigenvectors of the macroscopic observables) have the
property that, for all initial pure states and most times during the time
evolution, the evolving state and a relevant microcanonical distribution are
approximately indistinguishable.


A more recent exposition of the typicality approach, similar in spirit to von
Neumann’s, is to be found in [94]. Let us consider a uniform distribution
over the collection of normalized pure states belonging to a subspace of
large dimension $D$ (typically, a microcanonical energy shell), i.e., a
distribution that remains invariant under a unitary transformation within this
subspace. Then the ‘Hilbert space average’ (HA) [4] of the expectation
value $\langle\psi|A|\psi\rangle$ of an observable $A$, where $|\psi\rangle$ is drawn at random from the
above distribution, is given by

$$[\langle\psi|A|\psi\rangle]_{\mathrm{HA}} = \frac{\mathrm{Tr} A}{D}, \qquad \text{(10-87a)}$$

while the ‘Hilbert space variance’ (HV) of the expectation value of $A$ over
the same distribution is given by

$$[\langle\psi|A|\psi\rangle]_{\mathrm{HV}} = \left[\left(\langle\psi|A|\psi\rangle - [\langle\psi|A|\psi\rangle]_{\mathrm{HA}}\right)^2\right]_{\mathrm{HA}} = \frac{1}{D+1}\left[\frac{\mathrm{Tr} A^2}{D} - \left(\frac{\mathrm{Tr} A}{D}\right)^2\right]. \qquad \text{(10-87b)}$$

Making use of bounds on $(\mathrm{Tr} A)^2$ and $\mathrm{Tr} A^2$ in terms of the eigenvalue with
the largest modulus (say, $\lambda$), one concludes that the amount by which
the expectation value of an observable on a typical state deviates from
its expectation value averaged over all states is likely to be less than $\frac{1}{\sqrt{D}}$
times the eigenvalue with the maximum modulus, the latter quantity (i.e.,
the average over all states of the expectation value) being the same as the
expectation value over the microcanonical ensemble.


While the above typicality result is a kinematic one (i.e., does not refer 
explicitly to time evolution), it is closely related to dynamical typicality 
since the Hilbert space average gives the same effect as the one resulting 


from dephasing under appropriate non-degeneracy conditions. 


The uniform distribution of pure states in a subspace of dimension $D$ is
realized by fixing a basis $|1\rangle, |2\rangle, \cdots, |j\rangle, \cdots, |D\rangle$, expressing any pure state as
$|\psi\rangle = \sum_{j=1}^{D} c_j |j\rangle$, and choosing the $2D$ real numbers $\mathrm{Re}\, c_j, \mathrm{Im}\, c_j$ ($j =
1, 2, \cdots, D$), drawn randomly from a uniform distribution over the surface
of a $2D$-dimensional unit sphere. This corresponds to the so-called Haar
measure on the group of complex $D \times D$ unitary matrices. The term ‘Hilbert
space average’ refers to the expectation value of an observable - taken on a
state belonging to a given set of pure states - averaged over some specified
probability distribution on that set of states.
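The Hilbert space average and variance, and the sampling recipe just described, can be checked by direct sampling. The sketch below assumes a standard way of generating points uniformly on the unit sphere (normalizing vectors of independent Gaussian components) and takes the observable to be diagonal in the chosen basis, which involves no loss of generality since the distribution is unitarily invariant. The dimension, the sample size, and the eigenvalues of the observable are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
D, samples = 50, 20000

# Eigenvalues of a hypothetical observable A, taken diagonal in the chosen basis
a = rng.normal(size=D)

# Uniformly distributed normalized states: Gaussian Re c_j, Im c_j, then normalize
c = rng.normal(size=(samples, D)) + 1j * rng.normal(size=(samples, D))
c /= np.linalg.norm(c, axis=1, keepdims=True)

# Expectation value <psi|A|psi> for every sampled state
vals = (np.abs(c) ** 2) @ a

# Predictions: Tr A / D and (1 / (D + 1)) [Tr A^2 / D - (Tr A / D)^2]
ha_pred = a.sum() / D
hv_pred = (np.sum(a ** 2) / D - (a.sum() / D) ** 2) / (D + 1)

print(f"sample mean     {vals.mean():+.5f}   predicted HA {ha_pred:+.5f}")
print(f"sample variance {vals.var():.6f}   predicted HV {hv_pred:.6f}")

# Typical deviation versus the bound |lambda_max| / sqrt(D)
bound = np.abs(a).max() / np.sqrt(D)
print(f"sqrt(HV) = {np.sqrt(hv_pred):.4f}, bound = {bound:.4f}")
```

The spread of the expectation values shrinks as $D$ grows, which is the measure concentration that makes the expectation value on a typical state practically indistinguishable from the microcanonical one.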


A result based on dynamical typicality telling us how the ETH leads to the
microcanonical ensemble has been derived in [118], where the focus is on
the weak ETH (‘wETH’) in contrast to the so-called strong version.
Referring to an energy shell with energy eigenvalues $E_n$ ($n = 1, 2, \cdots$) ranging
from $E$ to $E + \epsilon$ (say), the wETH assumes that the expectation value of an
observable $A$ has very similar values for most of the eigenstates $|n\rangle$ in this
energy range (with reference to the uniform distribution, defined above,
of pure states within the shell; the strong version of ETH assumes that
all the eigenstates within the shell have this property). Assuming,
moreover, that the effective dimension defined in (10-82a) is sufficiently large,
it follows that the vast majority of all pure states, which exhibit the same
initial expectation value for some observable $A$, closely approach the
pertinent microcanonical expectation value of $A$ for practically all sufficiently
large times.


A number of issues of basic relevance in the statistical mechanics of iso- 
lated macroscopic quantum mechanical systems are addressed and re- 
viewed in the light of the idea of typicality in [135], principally by way of 
justifying the microcanonical ensemble along the lines followed, among 


others, in [59].

Canonical typicality.


We now turn to a brief outline of what is referred to as canonical typi- 
cality. The dynamical emergence of the canonical ensemble for a sub- 
system of a composite system under unitary evolution of the latter has 
been mentioned above in sec. 10.4.2.2. As mentioned earlier, the
traditional approach to subsystem thermalization arrives at the canonical
distribution of a subsystem by reduction from the microcanonical
distribution of a bigger composite system, where the question as to how the
microcanonical distribution is realized in the first place is not addressed.
There exists, however, a history of looking at the problem of subsystem 
thermalization in the more general context of non-equilibrium evolution 
of the composite system (refer back to sec. 10.4.2.2). We now refer to a 
few results on the complementary approach to the same problem, based 
on the approach of typicality. As mentioned above, the role of dephasing 
under non-equilibrium evolution is taken up, in the typicality-based ap- 
proach, by Hilbert space averaging and the associated feature of measure 


concentration. 


For instance, [57] generalizes the traditional approach by showing that 
the reduced density matrix obtained from a pure state, drawn at random 
from a uniform distribution within a microcanonical energy shell is likely 
to be close to the canonical distribution obtained from the corresponding 
microcanonical ensemble, the former approaching the latter more and 
more closely as the composite system (made up of the system of interest 
and a bath) approaches the thermodynamic limit. Here the term ‘likely
to be close’ is to be interpreted in the sense of average over the uniform
distribution over pure states within the energy shell. In this case, one
need not consider the long term time evolution because of the fact that
the averaging with respect to the uniform measure over the pure states
has the same effect as that of the dephasing under time evolution.
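A toy version of this statement can be checked numerically. The sketch below is a simplified illustration, not the construction used in [57]: it assumes a subsystem ‘A’ consisting of a single two-level unit and a bath of `n` identical, non-interacting two-level units, and takes the energy shell to be the set of states with exactly `K` excited units in total. In this model the reduced state of ‘A’ is diagonal, because the two states of ‘A’ are paired with bath states of different excitation number.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(4)
n, K = 14, 5   # hypothetical bath size and total number of excitations

# Shell basis: d0 states with 'A' unexcited (bath holds K excitations) followed
# by d1 states with 'A' excited (bath holds K - 1 excitations)
d0, d1 = comb(n, K), comb(n, K - 1)

# Random pure state drawn uniformly from the shell
c = rng.normal(size=d0 + d1) + 1j * rng.normal(size=d0 + d1)
c /= np.linalg.norm(c)

# Reduced state of 'A': diagonal, with populations given by the block weights
p1 = np.sum(np.abs(c[d0:]) ** 2)
rho_A = np.diag([1.0 - p1, p1])

# Reduction of the microcanonical ensemble on the same shell
micro_A = np.diag([d0, d1]) / (d0 + d1)

# Canonical form with beta fixed by the ratio of bath state numbers (unit spacing)
beta = np.log(d0 / d1)
gibbs_A = np.diag([1.0, np.exp(-beta)]) / (1.0 + np.exp(-beta))

dist = 0.5 * np.abs(np.diag(rho_A - micro_A)).sum()
print(f"shell dimension {d0 + d1}, trace distance {dist:.4f}")
print(np.allclose(micro_A, gibbs_A))
```

Increasing `n` at fixed ratio `K / n` enlarges the shell and makes the distance between the reduced state of a random pure state and the canonical state smaller, in line with the approach to the thermodynamic limit mentioned above.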


The approach of dynamical typicality is adopted in [119] to show how
subsystem thermalization comes to be realized in a closed many-body
quantum system such that the time-dependent reduced state $\hat{\rho}_A(t)$ of the
subsystem (‘A’) relaxes for most times $t$ to a Gibbs state $\hat{\rho}_G(\beta)$ (with inverse
temperature $\beta$, related to the density of bath states) of the same
subsystem under unitary evolution of the composite system (made up of ‘A’
and the bath ‘B’). Here the term ‘relaxes to’ means that the trace distance
$D(\hat{\rho}_A(t), \hat{\rho}_G(\beta))$, which measures the distinguishability between two states,
remains at a sufficiently small value.


The Hamiltonian $H$ of the composite system can be expressed as a sum

$$H = H_0 + V = H_A + H_B + V, \qquad \text{(10-88)}$$

where the decomposition into the subsystem Hamiltonian $H_A$, the bath
Hamiltonian $H_B$, and the interaction $V$ is, to some extent, arbitrary. This
arbitrariness is made use of in defining an optimal splitting into the three
parts so as to devise an effective perturbation scheme that works even in
the thermodynamic limit (when the energy levels come arbitrarily close
together and the eigenvectors tend to get modified in an uncontrolled
manner). The Hamiltonians $H$, $H_0$ are assumed to be non-degenerate, with
non-degenerate energy gaps, so that long term time evolution of an initial
pure state $|\psi\rangle\langle\psi|$ leads to the dephased state $\omega$.


Of special significance in the analysis are the so-called rectangular states
referred to in sec. 10.4.2.2, a rectangular state being one having a flat
energy distribution within some specified microcanonical energy shell from,
say, $E$ to $E + \Delta$. We denote such a rectangular state by a symbol of the
form ${}^{\sqcap}\omega$. Techniques relating to measure concentration in subspaces
of large dimension lead to the result that a typical state $|\psi\rangle\langle\psi|$ in a
microcanonical subspace gives a reduced state $\hat{\rho}_A$ that is close to the
reduction (${}^{\sqcap}\omega_A$) of the corresponding microcanonical state. Equivalently,
under the non-degeneracy assumption, one can show that the time
evolution of the reduced state $\hat{\rho}_A$ (for a sufficiently small subsystem ‘A’) tends
to the dephased state $\mathcal{D}\hat{\rho}$.


As mentioned above, the principal problem in the analysis relates to the
proper formulation of a perturbation scheme for a weak coupling since
the energy levels for a large system are dense, as a result of which it
becomes a problem to limit the interaction energy to within a sufficiently
small value. For this, the microcanonical state ${}^{\sqcap}\omega$ is compared with
${}^{\sqcap}\omega^{(0)}$, the dephased state corresponding to the non-interacting
Hamiltonian $H_0$, from which we obtain the reduced state ${}^{\sqcap}\omega^{(0)}_A$, thereby obtaining
a controlled approximation to ${}^{\sqcap}\omega_A$. Indeed, for an exponential density of
states for which one hopes to get equilibration to the Gibbs state $\hat{\rho}_G(\beta)$ for
the subsystem, the trace distance between ${}^{\sqcap}\omega_A$ and ${}^{\sqcap}\omega^{(0)}_A$ remains small
whenever $\| V \| < \frac{1}{\beta}$. Furthermore, it turns out that, under a reasonable
condition on the density of bath states, the trace distance between ${}^{\sqcap}\omega^{(0)}_A$
and the subsystem Gibbs state $\hat{\rho}_G(\beta)$ remains small. Bringing together
these two results on trace distance, and the assumption of randomness
of the initial state $|\psi\rangle$ under the unitarily invariant measure defined on
the microcanonical subspace for the composite system, along with results
on dynamical equilibration (sec. 10.4.1), one finally arrives ([119]) at the
emergence of the subsystem Gibbs state.


10.5 Equilibrium and non-equilibrium statistical mechanics: concluding observations


I have, in this book, attempted to cover a wide terrain in putting together 
the basic concepts and principles of equilibrium and non-equilibrium sta- 
tistical mechanics and integrating those into a coherent whole. At the 
end, I recall, by way of summary, the observation made by Feynman 
(sec. 1.1.2) that the subject of statistical mechanics involves a ‘climb-up’ 
and a ‘slide-down’, with a ‘summit’ in between — that summit being the 
formula describing any one of the equilibrium ensembles of statistical 


mechanics. 


The slide-down consists of a vast number of applications, based on di- 
verse approximation schemes, where the ensembles of equilibrium sta- 
tistical mechanics are invoked to explain and predict the behavior of 
model systems, all supposed to represent real-life systems, while ignoring 


inessential complexities of the latter. 


The choice of a model involves an unavoidable element of judgment, though
that judgment is mostly based on a large number of subtle and nuanced
considerations, involving comparisons with experimental data, and is far
from being arbitrary. The justification of choosing a model depends on the
extent to which it represents the observed behavior of real-life systems.


The principal endeavor of statistical mechanics is to explain the ther- 
modynamic behavior of macroscopic systems. In this the fundamental 
distinction between the microscopic and macroscopic descriptions of a 
system is of relevance. This distinction applies equally to equilibrium and 
non-equilibrium thermodynamics (indeed, thermodynamics refers only to 
the macroscopic description) and is of basic relevance in statistical me- 
chanics. In this book, we have adopted the approach of linking the two 
descriptions by means of the probability distribution over pure states 
that may either pertain to an equilibrium ensemble or may be an evolving 
one. At the microscopic level, the time evolution has to be compatible 
with the Hamiltonian of the system under consideration (along with the 
Hamiltonian operators of other interacting systems), while at the macroscopic
level, the probability distribution has to yield appropriate values of
the relatively small number of thermodynamic variables as determined 


by the constraints imposed on the system. 


Strictly speaking, the distinction between the microscopic and the
macroscopic descriptions is meaningful only for an infinitely large system,
in respect of which the thermodynamic principles can be recovered from the
equilibrium ensembles. Here the fundamental thermodynamic feature
consists of the functional dependence of the entropy (or of other
thermodynamic potentials such as the free energy) on the other macroscopic
variables (U, V, N in the case of a simple fluid). The hallmark of this
functional relation is the concavity of S (analogous features hold for the
other thermodynamic potentials, obtained by Legendre transformations),
which is central to the establishment of the thermodynamic limit.
Correspondingly, the fundamental feature of the equilibrium ensembles is the
variational principle satisfied by these.
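
As a reminder of what these two statements amount to, they may be written
out as follows. This is a sketch in standard textbook notation rather than
a quotation of formulas from the earlier chapters; the symbols X, λ, ρ, H
and β are chosen here for illustration, and the variational principle is
shown for the canonical ensemble only.

```latex
% Concavity of the entropy, with X = (U, V, N) and 0 \le \lambda \le 1:
S\bigl(\lambda X_1 + (1-\lambda) X_2\bigr) \;\ge\; \lambda\, S(X_1) + (1-\lambda)\, S(X_2).

% Variational principle (canonical ensemble): among all normalized
% densities \rho, the free-energy functional
F[\rho] \;=\; \mathrm{Tr}(\rho H) \;+\; k_B T\, \mathrm{Tr}(\rho \ln \rho)
% is minimized by \rho_{\rm eq} = e^{-\beta H} / \mathrm{Tr}\, e^{-\beta H},
% for which F[\rho_{\rm eq}] = -k_B T \ln \mathrm{Tr}\, e^{-\beta H}.
```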


The distinction between the microscopic and the macroscopic descrip- 
tions remains of fundamental relevance in explaining the principles of 
non-equilibrium thermodynamics which apply only to near-equilibrium 
phenomena where the idea of local equilibrium is meaningful, and where 
the equilibrium ensembles are successfully made use of in explaining the 
non-equilibrium phenomena. In this, the fluctuation-dissipation theo- 
rem constitutes the extension of equilibrium statistical mechanics into 
the linear response regime. It is in this regime that the principles of 
non-equilibrium thermodynamics are meaningfully interpreted in terms 


of non-equilibrium statistical mechanics. 


The question arises as to how far the distinction between the microscopic 
and the macroscopic descriptions remains meaningful beyond the near- 
equilibrium regime. Indeed, in the far-from-equilibrium situation the 
thermodynamic variables describing the local densities of conserved or 
near-conserved quantities lose relevance and, significantly, the concept 
of entropy becomes doubtful. While there have been attempts at extend- 
ing the definition of entropy to far-from-equilibrium situations, these are 


yet to give rise to a cogent and coherent theoretical concept. On the
other hand, it appears that the thermodynamic concept of entropy production
rate (‘entropy production’ in brief) survives the change of scenario
from the near-equilibrium to the far-from-equilibrium phenomena and 
continues to provide the quantitative indicator of irreversibility in non- 
equilibrium processes, especially those pertaining to non-equilibrium steady 


states (NESS). 


In order to set up a steady state, the system under consideration is to be 
made to interact with external systems that set up fluxes through it. The 
effect of the external systems (commonly, large reservoirs, referred to as 
baths) can be simulated by appropriately contrived thermostats that mod- 
ify the dynamical equations of the system away from Hamiltonian ones 
(an alternative approach retains the Hamiltonian equations while intro- 
ducing the effect of the external systems by means of appropriate bound- 
ary conditions). In this case, the entropy production is associated with 
the phase space contraction rate — ultimately linked with the microscopic 
description of the system that provides for dissipation and irreversibil- 
ity at the macroscopic level. The phase space contraction results in an 
asymptotic natural measure (starting from an initial probability measure) 
that resides on a fractal attractor. This, in turn, is indicative of dynamical 
chaos characterizing the system under consideration. In other words, dy- 
namical chaos (minimally, of the Axiom-A type) at the microscopic level, 
resulting in phase space contraction, is manifested as dissipation and 
irreversibility at the macroscopic level, expressed quantitatively in terms 


of the entropy production. 
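
The link between entropy production and phase space contraction can be
indicated schematically as follows. This is a sketch that assumes a
deterministic thermostat (for instance, of the Gaussian or Nosé-Hoover
type); the symbols Γ, F, σ and μ are introduced here for illustration and
need not match the notation of the earlier chapters.

```latex
% Thermostatted equations of motion in phase space, \Gamma = (q, p):
\dot{\Gamma} = F(\Gamma), \qquad \nabla_\Gamma \cdot F \neq 0 \ \text{in general}.

% Phase space contraction rate:
\sigma(\Gamma) = -\,\nabla_\Gamma \cdot F(\Gamma).

% In the steady state, the entropy production rate is identified with the
% average of \sigma over the natural measure \mu:
\frac{1}{k_B}\,\frac{d_i S}{dt} \;=\; \langle \sigma \rangle_\mu \;\ge\; 0 .
```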


In contrast, dissipative processes in an isolated system pose problems
of a different nature. It was the isolated system that Boltzmann was
specifically concerned about. What is the mechanism by which an iso- 
lated system approaches the state of equilibrium, defined in macroscopic 
terms, and how is the irreversibility of such an approach to be accounted 
for in microscopic terms? One needs to answer these questions without 
invoking external systems such as thermostats, and to confine oneself to 
an explanation in terms of the Liouvillian dynamics. One can even extend 
those considerations to non-equilibrium steady states by invoking appro- 
priate flux boundary conditions while still describing the local dynamics 


in the phase space in Liouvillian terms. 


The approach to equilibrium in an isolated system made up of a large
number of particles (N → ∞; we consider a simple fluid as the system of
interest) involves several spatial and temporal scales. The terminal stage
of the process involves the hydrodynamic modes that can be described in
terms of local thermodynamic variables satisfying the balance equations
of hydrodynamics. The latter can be interpreted within the framework
of the linear response theory of non-equilibrium statistical mechanics.
Significantly, the hydrodynamic modes are associated with Ruelle-Pollicott
resonances in the spectral decomposition of the correlation functions,
which entail a temporal decay of correlations of the kind expected of mixing
dynamics. The eigenfunctions of the Frobenius-Perron operator associated with
the hydrodynamic modes exhibit fractal features. Analogous statements 
hold in respect of the non-equilibrium steady states close to equilibrium, 
in respect of which one can establish the existence of a positive entropy 


production as a quantitative indicator of dissipation and irreversibility.
For irreversible processes of a more general description, the irreversibility
is expressed quantitatively in terms of the Evans-Searles dissipation
function, which establishes a link with the microscopic description and
relates to the phase space contraction. All this is once again indicative of


dynamical chaos at the microscopic level. 


The relevance of chaos is also in evidence in the quantum mechanical
formulation of statistical mechanics, where the random matrix theory (RMT)
and the eigenstate thermalization hypothesis (ETH), the latter built upon
the foundation provided by the former, are found to account for the
thermalization of an isolated system and also for the emergence of thermal
states in subsystems (of a composite system) under quite general conditions
on the state of the composite system. The ETH is a generalization of
Berry’s conjecture that tells us that, in a certain well-defined sense, the
Wigner representation of an energy eigenstate of a large system whose
classical analog is chaotic resembles the microcanonical distribution in
the semi-classical limit.
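
For orientation, the ETH is commonly stated as an ansatz for the matrix
elements of a few-body observable in the energy eigenbasis. The form below
is the one usually attributed to Srednicki and is given here only as a
sketch (with k_B = 1); the symbols are assumptions made for illustration and
may differ from the notation used in the earlier chapters.

```latex
% O_{mn} = \langle m | \hat{O} | n \rangle, \quad
% \bar{E} = (E_m + E_n)/2, \quad \omega = E_m - E_n
O_{mn} \;=\; \mathcal{O}(\bar{E})\,\delta_{mn}
        \;+\; e^{-S(\bar{E})/2}\, f_O(\bar{E}, \omega)\, R_{mn}
% \mathcal{O}(\bar{E}): smooth function, equal to the microcanonical average
% S(\bar{E}):           thermodynamic entropy at energy \bar{E}
% f_O:                  smooth function of its two arguments
% R_{mn}:               random numbers with zero mean and unit variance
```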


Thus, dynamical chaos seems to be of central relevance in statistical me- 
chanics, especially with regard to non-equilibrium processes, where the 
latter are characterized by dissipation and irreversibility. As for the equi- 
librium configurations, all one needs is ergodicity, which does not nec- 
essarily require dynamical chaos. The variational principle satisfied by 
the equilibrium ensembles implies that non-equilibrium states close to 
equilibrium — ones that can be interpreted as constrained equilibria — are 
all of a lower entropy (in the description in terms of the microcanonical 


ensemble) compared to the equilibrium state but, strictly speaking, this
does not imply an approach to the said equilibrium state through a
succession of states of progressively increasing entropy. This is where the
mixing property of the microscopic dynamics becomes relevant: the
monotonic approach (exponential or algebraic) towards the equilibrium
state, attended with strictly positive entropy production, emerges as a
consequence of mixing.


A similarly crucial role of dynamical chaos also seems to be indicated
in the context of non-equilibrium steady states far from equilibrium, where
Axiom-A dynamics appears to be a minimal requirement. A question that
is often raised relates to the possible role of large N in the emergence
of irreversibility. Is the large-N condition sufficient to explain
irreversibility?


This appears to be the wrong way of posing the problem since it reduces
the question of irreversibility to a dichotomous choice between ‘large-N’
and ‘chaos’. As has been stated earlier, the very idea of irreversibility
is founded upon the distinction between the microscopic and the
macroscopic descriptions of a system which, strictly speaking, is
meaningful in the N → ∞ limit (in the case of finite N, one can always ask
the question, ‘how large is large?’). Looked at from another point of view,
chaos and irreversibility may well be emergent properties of a system
made up of an enormously large number of particles. More specifically,
despite the KAM theorem, integrable dynamics is fragile under small
perturbations since a fraction of the KAM tori break up because of a violation
of the non-resonance condition, as a result of which there open up gaps
in between the KAM tori through which trajectories diffuse in the phase
space, a process referred to as Arnold diffusion (see, for instance, [90]).


Thus, considering a small perturbation of strength, say ε, over an
integrable system with N degrees of freedom, there exists numerical
evidence that, for a fixed ε, the system displays regular motion in only a
small fraction of the volume of the phase space, which vanishes
exponentially with N (refer to [35], where a large number of coupled symplectic
maps are considered, with a small value of the coupling strength). This
picture of emergent chaos for systems with a large dimension of the phase
space is, however, not a simple one, since the time scale of diffusion in
the phase space has been shown to be exponentially large in 1/ε for a fixed
N. The complexity of this picture, referred to as the ‘Nekhoroshev
scenario’, consists of the dependence of the diffusion time scale on N for
fixed ε, where one finds that an enormously large value of N significantly
enhances the diffusion rate. What is more, the coupling strengths
for real-life systems are often of considerable magnitude, which alters
the Nekhoroshev scenario in favour of large scale diffusion in the phase
space. On the whole, however, the idea of emergent chaos in the large-N
limit is not a rigorously founded one and is in the nature of a likely
possibility.
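
The exponential estimate referred to above is usually written in the
following form. This is a sketch of the standard statement of the
Nekhoroshev theorem for a perturbed integrable Hamiltonian; the symbols I,
θ, C, T_0, ε_0, a and b are introduced here for illustration, and the
remark on the N-dependence of the exponents assumes a convex (or
quasi-convex) unperturbed Hamiltonian.

```latex
% Perturbed integrable Hamiltonian with N degrees of freedom:
H(I, \theta) = H_0(I) + \epsilon\, H_1(I, \theta)

% Stability of the action variables over exponentially long times:
|I(t) - I(0)| \;\le\; C\,\epsilon^{b}
\quad \text{for} \quad
|t| \;\le\; T_0 \exp\!\left[\left(\frac{\epsilon_0}{\epsilon}\right)^{a}\right]

% The exponents a, b > 0 decrease with N (typically of order 1/N), so that
% for fixed \epsilon the bound on the diffusion time weakens as N grows.
```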


Having seen that a dichotomous choice of the type ‘chaos or large-N’ is 
too simplistic to lay an adequate foundation for equilibrium and non- 
equilibrium statistical mechanics (including, in particular, an explana- 
tion of the approach to equilibrium in an isolated system), it is necessary 
to observe that the scenario based on Arnold diffusion and the Nekhoroshev
estimate providing a lower bound to the diffusion time scale pertains
to the classical description, while an analogous quantum framework is
lacking. It nevertheless seems likely that systems made up of an enor- 
mously large number of particles satisfy the RMT and the eigenstate ther- 
malization hypothesis regardless of their constitution. In other words, 
‘large-N’ and ‘chaos’ appear to make up a complex composite picture, a 
large part of which is yet to be explored in elucidating the foundations of 


statistical mechanics. 


Finally, there remains the question of the possible role of external noise 
in the thermalization of an isolated system when one recognizes that, in 
reality, perfect isolation is not possible and that, in the large-N limit, even 
an arbitrarily small external influence can be inordinately effective (refer 
to [15], [85] for the basic idea involved) in causing transitions between 
the energy levels within a narrow energy window of the system. While the 
ETH highlights the possibility of thermalization as an intrinsic process, 
noise-driven thermalization remains an equally likely possibility. Indeed, 
the Langevin dynamics itself (or, more generally, the reduced dynamics 
described by a master equation) provides a model for such noise-driven 


thermalization. 
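
The statement about Langevin dynamics can be illustrated with a minimal
numerical sketch. It assumes a one-dimensional free particle described by
the Ornstein-Uhlenbeck (velocity Langevin) equation with friction
coefficient gamma and noise strength fixed by the temperature; the function
name `thermalize` and all parameter names and values are hypothetical,
chosen only for this illustration. Starting from an ensemble at rest, the
noise drives the mean square velocity towards the equipartition value kT/m.

```python
import math
import random


def thermalize(n_particles=5000, n_steps=200, dt=0.05,
               gamma=1.0, kT_over_m=1.0, seed=0):
    """Return the ensemble average of v**2 after n_steps of Langevin dynamics.

    All particles start at rest (v = 0), i.e. far from equilibrium.
    """
    rng = random.Random(seed)
    # Exact one-step update of the Ornstein-Uhlenbeck process:
    # v -> a*v + b*xi, with xi a standard Gaussian random number.
    a = math.exp(-gamma * dt)
    b = math.sqrt(kT_over_m * (1.0 - a * a))
    total = 0.0
    for _particle in range(n_particles):
        v = 0.0
        for _step in range(n_steps):
            v = a * v + b * rng.gauss(0.0, 1.0)
        total += v * v
    return total / n_particles


if __name__ == "__main__":
    # Equipartition predicts <v^2> = kT/m once gamma * t >> 1.
    print(thermalize())
```

With the default values (gamma * t = 10) the returned average should lie
close to kT/m = 1, up to statistical scatter, while a run with only a few
steps remains well below it.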


This book is to come to an end at this point since we are now being too 
vague and speculative in looking for foundational principles of statistical 
mechanics — those relevant in the context of the ‘climb-up’ mentioned by 


Feynman.